SPECIFICATION

KIDNEY TOXICITY PREDICTIVE GENES 



Cross Reference to Other Patent Applications 

[01] This application claims priority from U.S. provisional application Serial No.
60/361,128, titled "Kidney Toxicity Predictive Genes", filed on February 27, 2002, which
is hereby incorporated by reference in its entirety.

Reference to a Sequence Listing and Tables 

[02] This application contains a gene sequence listing and four tables submitted
on a compact disc whose file name is "Tables for Burning", created on February 27,
2003, which contains five files and is herein incorporated by reference in its entirety.
The five files are (a) a gene sequence listing, Table 32 (403 KB), in Microsoft® Word®,
(b) Table 38 (785 KB) in Microsoft Excel®, (c) Table 39 (957 KB) in Excel, (d) Table 40
(992 KB) in Excel, and (e) Table 45 (57 KB) in Excel.

Background of the Invention 

[03] This invention is in the field of toxicology. More specifically, it relates to kidney
toxicity predictive genes and the methods of using such genes to predict kidney
toxicity. Molecular biology and genomics technologies have the potential to create
dramatic advances and improvements for the science of toxicology as for other
biological sciences. See, for example, MacGregor et al., Fund. Appl. Tox. 26:156-173,
1995; Rodi et al., Tox. Pathology 27:107-110, 1999; Cunningham et al., Ann.
N.Y. Acad. Sci. 919: 52-67, 2000; Pritchard et al., Proc. Natl. Acad. Sci. USA
98:13266-13271, 2001; and Fielden and Zacharewski, Tox. Sciences 60: 6-10, 2001.
The advantage of these technologies is that they can provide massive amounts of
parallel information and that this information concerns processes and events 
occurring at the molecular level. This level of information is in dramatic contrast to 
conventional safety assessment toxicology that, to a large extent, currently relies on 
subjective evaluation (e.g., in-life observations of behavior, observations of gross 
abnormalities at necropsy and histopathological examination of stained tissue slides 
using a microscope). These current methodologies may be largely subjective and in 
some cases such as histopathological evaluation, they require someone with a high 
degree of training, experience and skill to make competent evaluations. 
Furthermore, many of the methodologies require access to organs and tissues that 
necessitates either killing laboratory animals or surgery to obtain tissue specimens. 

[04] Recently, there have been some initial efforts to apply molecular biology and
genomics technologies to toxicology. Some efforts have involved application of gene 
expression measurements. See, for example, U.S. Patent 6,228,589 and WO 
01/05804. Analysis of the data has yielded interesting observations of gene 
expressions that appear to correlate with some toxic effects or mechanisms. See, for 
example, Mueller et al. Environmental Health Perspectives 106(5): 277-230 (1998). 
However, there has been very little published work in toxicology so far that applies 
rigorous analytical and statistical techniques to the massive amounts of data 
available from genomics technologies. The observations, so far, have tended to be 
phenomenological and focused on individual gene responses rather than determining 
the generally applicable capabilities of patterns of gene expression to predict toxic 
effects (see, for example, studies of gene expression altered by exposure to kidney 
toxicants in Bartosiewicz et al., J. Pharm. Exp. Ther. 297: 895-905, 2001; Lieberthal, 
Curr. Opin. Nephrol. Hypertens 7:289-295, 1998; Huang et al., Tox. Sciences 63: 
196-207, 2001). Even in the larger field of biological sciences, these types of 
analyses are just beginning to be evidenced in the literature (e.g., Golub et al.,
Science 286: 531-537, 1999). 

[05] What is needed are genes and predictive models that are capable of
predicting toxicity response.

Brief Summary of the Invention

[06] The invention provides kidney toxicity predictive genes and predictive models
which are useful to predict toxic responses to one or more agents.



[07] In one aspect, the invention provides methods of predicting kidney toxicity in
an individual exposed to an agent, which include the steps of: (a) obtaining a
biological sample from an individual treated with the agent, or treating a biological
sample obtained from an individual with the agent, or treating in vitro cultured cells or
explants with the agent; (b) obtaining a gene expression profile from the biological
sample or in vitro cultured cells or explants; and (c) using the gene expression profile
from the biological sample or cells treated with the agent as a test set and a database
of gene expression profiles and toxicity classifications as a training set, and using
kidney toxicity predictive genes and a Predictive Model to determine whether the
agent will induce kidney toxicity in the individual or would be predicted to produce
kidney toxicity following in vivo exposure.
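
By way of illustration only, the training set, test set and Predictive Model of step (c) can be sketched in Python. The sketch assumes a plain majority-vote K-nearest neighbor classifier with Euclidean distance; it does not reproduce any commercial implementation, and the function name `knn_predict`, the three-gene profiles and the labels are hypothetical values chosen for the example.

```python
from collections import Counter
from math import dist


def knn_predict(train_profiles, train_labels, test_profile, k=3):
    """Classify one expression profile by majority vote of its k nearest training profiles."""
    # Rank training samples by Euclidean distance to the test profile.
    ranked = sorted(
        zip(train_profiles, train_labels),
        key=lambda pair: dist(pair[0], test_profile),
    )
    # Count the toxicity classifications of the k closest training samples.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical expression values (e.g., log ratios) for three predictive genes.
train_profiles = [
    (2.1, 1.8, -1.5), (1.9, 2.2, -1.2), (2.4, 1.6, -1.7),  # classified toxic
    (0.1, -0.2, 0.0), (-0.1, 0.3, 0.2), (0.2, 0.1, -0.1),  # classified non-toxic
]
train_labels = ["toxic"] * 3 + ["non-toxic"] * 3

# Test set: a single profile from a sample treated with the agent.
prediction = knn_predict(train_profiles, train_labels, (2.0, 1.9, -1.4), k=3)
print(prediction)
```

In this sketch the database of profiles and classifications plays the role of the training set, and each profile from an agent-treated sample is classified independently.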

[08] In one embodiment, the predictive model utilizes expression profiles from 

sets of kidney toxicity predictive gene(s) selected from Combination 6, infra, wherein 
the set is one or more kidney toxicity predictive gene(s). In other embodiments, the 
predictive model utilizes expression profiles from sets of one or more kidney toxicity 
predictive gene(s) selected from Combination 5, 4, 3, 2, or 1, wherein the set is one
or more kidney toxicity predictive gene(s). 

[09] In another aspect, the invention provides methods for determining the
presence or absence of a no-observable effect level (NOEL) of an agent by the steps
of: (a) obtaining biological samples from individuals treated with the agent at different
dose levels, or treating a biological sample obtained from an individual with different
dose levels of the agent, or treating in vitro cultured cells or explants with different
dose levels of the agent; (b) obtaining gene expression profiles of the samples; and
(c) using the gene expression profiles from the biological samples as a test set and a
database of gene expression profiles and toxicity classifications as a training set, and
using kidney toxicity predictive genes and a Predictive Model to determine or predict
whether and at which dose levels the agent will induce kidney toxicity. In one
embodiment, the predictive model utilizes expression profiles from sets of kidney
toxicity predictive gene(s) selected from Combination 6, infra, wherein the set is one
or more kidney toxicity predictive gene(s). In other embodiments, the predictive
model utilizes expression profiles from sets of one or more kidney toxicity predictive
gene(s) selected from Combination 5, 4, 3, 2, or 1, wherein the set is one or more
kidney toxicity predictive gene(s).
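
As a non-limiting sketch of how per-dose predictions might be reduced to a NOEL determination, the following Python fragment takes one predicted toxicity call per dose level and reports the highest tested dose lying below the lowest dose predicted to be toxic. The function name `predicted_noel`, the rule used and the dose values are assumptions made for the example.

```python
def predicted_noel(calls_by_dose):
    """Return the highest tested dose below the lowest dose predicted toxic.

    calls_by_dose maps each tested dose level to True (predicted toxic) or
    False (predicted non-toxic). None is returned when even the lowest tested
    dose is predicted toxic, i.e. no NOEL is present in the tested range.
    """
    noel = None
    for dose in sorted(calls_by_dose):
        if calls_by_dose[dose]:
            break  # first dose predicted toxic; higher doses are not considered
        noel = dose
    return noel


# Hypothetical model calls for four dose levels (e.g., mg/kg).
calls = {5: False, 20: False, 80: True, 320: True}
print(predicted_noel(calls))
```

Stopping at the first dose predicted toxic is a conservative choice: a non-toxic call at a dose above a toxic call does not raise the reported level.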

[10] In another embodiment, the predictive genes and models may be used with 

an in vitro system to identify in vitro systems that can be used to accurately predict in 
vivo toxicity and to use the identified in vitro systems to accurately predict in vivo 
toxicity. 

[11] In another aspect, the invention provides methods of identifying a kidney
toxicity predictive gene in an individual including the steps of: (a) providing a set of
candidate toxicity predictive genes; (b) evaluating said genes for their predictive
performance with at least one training and test set of data in a predictive model to
identify genes which are predictive of kidney toxicity; and (c) testing the performance
of the predictive genes for their ability to predict kidney toxicity with different training
and test sets of data, with accurate as compared to random classifications, and with
test data external to the data used to derive the predictive genes. In one
embodiment, the candidate toxicity predictive genes are rat toxicity genes.
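
The comparison of accurate with random classification in step (c) can be illustrated with a short Python sketch. It assumes, for brevity, a 1-nearest-neighbor classifier scored by leave-one-out testing rather than separate training and test sets; the function names, the two-gene profiles and the number of random trials are hypothetical.

```python
import random
from math import dist


def loo_accuracy(profiles, labels):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier."""
    correct = 0
    for i, profile in enumerate(profiles):
        # The nearest other sample supplies the predicted classification.
        j = min(
            (n for n in range(len(profiles)) if n != i),
            key=lambda n: dist(profiles[n], profile),
        )
        correct += labels[j] == labels[i]
    return correct / len(profiles)


def random_label_accuracy(profiles, labels, trials=20, seed=0):
    """Mean leave-one-out accuracy after randomly reassigning the classifications."""
    rng = random.Random(seed)
    scores = []
    for _ in range(trials):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        scores.append(loo_accuracy(profiles, shuffled))
    return sum(scores) / trials


# Hypothetical expression values for two candidate genes in eight samples.
profiles = [
    (2.0, 1.8), (2.2, 2.1), (1.9, 2.3), (2.4, 1.7),    # histopathology "yes"
    (0.1, -0.2), (-0.1, 0.3), (0.2, 0.1), (0.0, -0.1),  # histopathology "no"
]
labels = ["yes"] * 4 + ["no"] * 4

true_accuracy = loo_accuracy(profiles, labels)
random_accuracy = random_label_accuracy(profiles, labels)
print(true_accuracy, random_accuracy)
```

Genes that are genuinely predictive would be expected to score clearly higher with the accurate classifications than with the randomly assigned ones.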

[12] In another aspect, the invention provides methods for determining the
presence or absence of a no-observable effect level (NOEL) of an agent by the steps
of: (a) obtaining biological samples from individuals treated with the agent at different
dose levels, or treating a biological sample obtained from an individual with different
dose levels of the agent, or treating in vitro cultured cells or explants with different
dose levels of the agent; (b) obtaining gene expression profiles of the samples; and
(c) using the gene expression profiles from the biological samples as a test set and a
database of gene expression profiles and toxicity classifications as a training set, and
using kidney toxicity predictive genes and a Predictive Model to determine or predict
whether and at which dose levels the agent will induce kidney toxicity. In one
embodiment, the predictive model utilizes expression profiles from sets of kidney
toxicity predictive gene(s) selected from Combination 6, infra, wherein the set is one
or more kidney toxicity predictive gene(s). In other embodiments, the predictive
model utilizes expression profiles from sets of one or more kidney toxicity predictive
gene(s) selected from Combination 5, 4, 3, 2, or 1, wherein the set is one or more
kidney toxicity predictive gene(s).

[13] In another aspect, the invention provides a computer program product which
includes a set of kidney toxicity predictive genes derived from mining a database
having a plurality of gene expression profiles indicative of toxicity. In one
embodiment, the set of kidney toxicity predictive genes includes at least one toxicity
predictive gene from the Combination 6, 5, 4, 3, 2, or 1 list.

[14] In another aspect, the invention provides a library of information about kidney 

toxicity predictive genes produced by the methods disclosed herein. 

[15] In another aspect, the invention provides an integrated system for predicting 

kidney toxicity comprising: an array reader modified to read gene expression profiles 
from biological samples exposed to a test agent, operably linked to a computer 
comprising a database file having a plurality of kidney toxicity predictive genes. 

BRIEF DESCRIPTION OF THE DRAWINGS 

[16] Figure 1 is a flow diagram illustrating the identification of kidney toxicity
predictive genes. The pathway is given for discovery of kidney toxicity predictive
genes using the database of expression array data (Rat CT array) and toxicity data
for kidney samples from rats treated with various compounds (see Table 1). Genes
with expression correlating with pathology were determined using a variety of
correlation statistics (see, for example, Tables 2 and 3). The predictive model used was
the GeneSpring Predict Parameter Value model, which employs a K-nearest neighbor
model.

[17] Figure 2 is a graph which shows the percent of overall correct calls as a
function of the number of predictivity genes using histopathology correlating genes
(Pearson measure) as the input gene list with Training and Test Set A. The percent
of overall correct calls is presented as a function of the number of kidney toxicity
predictivity genes. The input gene list consisted of 66 genes that showed a
statistically significant correlation with the histopathology scores using Pearson's
correlation measure (r-value > 0.4). Training and Test Set A was used with other
model values of 10 nearest neighbors and a p-value ratio cutoff of 0.5. An optimum
gene number of 49 was observed (lowest number of genes giving the highest percent
of overall correct calls) for this case.
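
The two selection rules described for Figure 2 can be sketched as follows, purely by way of example: candidate genes are retained when their Pearson correlation with the histopathology scores exceeds a threshold, and the optimum gene number is taken as the lowest gene count giving the highest percent of correct calls. The function names, gene names, scores and accuracy values below are hypothetical and are not data from the Figures or Tables.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences.

    Assumes neither sequence is constant (non-zero variance).
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def optimum_gene_number(percent_correct_by_count):
    """Lowest number of genes that gives the highest percent of correct calls."""
    best = max(percent_correct_by_count.values())
    return min(n for n, pct in percent_correct_by_count.items() if pct == best)


# Hypothetical histopathology scores and expression values for five samples.
scores = [0, 0, 1, 2, 3]
expression = {
    "gene_a": [0.1, 0.0, 0.9, 2.1, 2.8],   # tracks the pathology score
    "gene_b": [1.0, -1.0, 0.5, -0.5, 0.0],  # does not track the score
}
selected = [g for g, values in expression.items() if pearson_r(values, scores) > 0.4]

# Hypothetical percent correct calls observed for increasing numbers of genes.
best_count = optimum_gene_number({5: 70.0, 10: 85.0, 20: 90.0, 40: 90.0})
print(selected, best_count)
```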

[18] Figure 3 is a flow diagram illustrating how kidney toxicity predictive genes are
evaluated for performance. Performance of the predictive model is evaluated using 6
sets of training and test data (Rat CT expression array data). The training and test
sets have accurate classification assignments (histopathology "yes" or "no" for each
sample) or random classification assignments ("yes" and "no" randomly assigned to
samples). The K-nearest neighbor model is used with input being lists of predictive
genes, as indicated, and the training and test set data. Four different measures of
prediction are considered as indicated.

[19] Figure 4 is a graph that shows the cumulative predictive performance of 

Combo 6 genes. The mean, minimum and maximum percent accuracy for 6 training 
and test sets are presented for Combo 6 genes that were used cumulatively in the 
order given in Table 14. 

[20] Figure 5 is a graph that shows the cumulative predictive performance of 

Combo 5 genes. The mean, minimum and maximum percent accuracy for 6 training 
and test sets are presented for Combo 5 genes that were used cumulatively in the 
order given in Table 14.

[21] Figure 6 is a graph that shows the cumulative predictive performance of

Combo 4 genes. The mean, minimum and maximum percent accuracy for 6 training 
and test sets are presented for Combo 4 genes that were used cumulatively in the 
order given in Table 14. 

[22] Figure 7 shows the k-means and tree cluster analysis of Combo 6 genes. 

[23] Figure 8 shows the Ward's cluster analysis of the Combo 6 gene set.

[24] Figure 9 shows a scanned autoradiogram of a Western blot of serum 

samples from 8 animals probed with antibodies to clusterin and insulin-like growth 
factor binding protein 1. Sample information is indicated in the figure. The figure
also presents transcriptional differential expression levels of the insulin-like growth 
factor binding protein 1 gene observed in kidney samples from these animals. 

BRIEF DESCRIPTION OF THE TABLES 

[25] Table 1 lists the compounds, dose levels, kidney pathology and 

abbreviations in the database. 

[26] Table 2 lists genes whose expression at 24h directly correlates with kidney 

tubular necrosis at 72h, ranked by Pearson correlation coefficient. 

[27] Table 3 lists genes whose expression at 24h inversely correlates with kidney 

tubular necrosis at 72h, ranked by Spearman correlation coefficient. 

[28] Table 4 lists the distribution of compounds in individual training and test sets 

for 24 hour kidney data. 

[29] Table 5 lists the predictive genes for 24 hour expression data. 

[30] Table 6 lists the randomly selected gene subsets from 24 hour combo all 

(216 genes). 



WO 03/100030 



PCT/US03/06196



[31] Table 7 lists the randomly selected gene subsets from 24 h combo 6 gene 

set (28 genes). 

[32] Table 8 lists the randomly selected gene subsets from 24 h combo 5 gene 

set (25 genes). 

[33] Table 9 lists the randomly selected gene subsets from 24 h combo 4 gene 

set (23 genes). 

[34] Table 10 lists the randomly selected gene subsets from array genes 

excluding combo all set. 

[35] Table 11 lists the kidney toxicity individual sample prediction values for 24

hour data predictive genes (combined list and subsets). 

[36] Table 12 lists the kidney toxicity compound-dose prediction values for 24

hour data predictive genes (combined list and subsets). 

[37] Table 13 lists the kidney toxicity compound prediction values for 24 hour data 

predictive genes (combined list and subsets). 

[38] Table 14 lists the order of genes used for cumulative analysis of predictive 

performance of predictive combo gene sets. 

[39] Table 15 lists the individual gene predictions for combo 6. 

[40] Table 16 lists the individual gene predictions for combo 5. 

[41] Table 17 lists kidney toxicity individual sample prediction values for 24 hour 

data with random gene subsets. 

[42] Table 18 lists the comparison of predictivity for true kidney toxicity

classification and random classification using combo gene sets and random subsets 
and 24 hour data. 

[43] Table 19 lists the distribution of compounds in individual training and test
sets for 6 hour kidney data.

[44] Table 20 lists the genes whose expression at 6 hours directly correlates with 

kidney tubular necrosis at 72 hours, ranked by Pearson correlation coefficient. 

[45] Table 21 lists the genes whose expression at 6 hours inversely correlates 

with kidney tubular necrosis at 72 hours, ranked by Spearman correlation coefficient.

[46] Table 22 lists the genes whose expression at 6 hours is predictive of kidney 

toxicity at 72 hours. 

[47] Table 23 lists the kidney toxicity compound-dose prediction values for 6 hour 

data predictive genes (combined list and subsets). 

[48] Table 24 lists the distribution of compounds in individual training and test 

sets for the 72 hour kidney data. 

[49] Table 25 lists the genes whose expression at 72 hours directly correlates 

with kidney tubular necrosis at 72 hours, ranked by Pearson correlation coefficient. 

[50] Table 26 lists the genes whose expression at 72 hours inversely correlates 

with kidney tubular necrosis at 72 hours, ranked by Spearman correlation coefficient.

[51] Table 27 lists the genes whose expression at 72 hours is predictive of kidney 

toxicity at 72 hours. 

[52] Table 28 lists the kidney toxicity compound-dose prediction values for 72 

hour data predictive genes (combined list and subsets). 

[53] Table 29 lists the predictive performance of various models. 

[54] Table 30 lists the logistic discrimination coefficients. 

[55] Table 31 lists the prediction of kidney toxicity for samples external to 

database. 

[56] Table 32 lists the genes predictive for kidney tubular necrosis, sequences,
and accession numbers.



[57] Table 33 lists the kidney predictive genes (376 genes) organized by time 

point and combo category. 

[58] Table 34 lists the RCT genes (ESTs) predictive for kidney tubular necrosis: 

best homology matches. 

[59] Table 35 lists the genes that are predictive at all three time points. 

[60] Table 36 lists the genes that are the most predictive across the time points. 



[61] Table 37 lists the kidney toxicity predictive genes whose protein products are
known to be secreted. The genes are from the table listing all the kidney predictive
genes at the three time points of 6, 24 and 72 hours. The protein products are easier to
access since they are secreted into body fluids and are thus more amenable to
quantification. Therefore, these proteins could be monitored in body fluids of subjects
such as humans, and toxicity predictions could be made.



[62] Table 38 lists the expression data for the 6 hour timepoint. 

[63] Table 39 lists the expression data for the 24 hour timepoint. 

[64] Table 40 lists the expression data for the 72 hour timepoint. 

[65] Table 41 lists the predictive performance of predictive genes organized by 

occurrence on training/test set lists (combo number) and time point. 

[66] Table 42 lists the summary output of the predictive computer software 

product. 

[67] Table 43 lists the detailed output of the predictive computer software product. 

[68] Table 44 lists protein marker candidate identification information that includes 

the gene name, % correct calls, average fold induction for negative histopathology 
samples, and average fold induction for positive histopathology samples.

[69] Table 45 lists input data used for the predictive computer program product.

DETAILED DESCRIPTION OF THE INVENTION 

[70] This invention relates to methods of predicting whether an agent or other 

stimulus is capable of inducing kidney toxicity in a recipient organism using predictive 
molecular toxicology analysis. In particular, the invention provides methods of 
predicting kidney toxicity that comprise analyzing gene and/or protein expression 
across a number of kidney toxicity biomarkers disclosed herein for patterns of 
expression that correlate with and are predictive of kidney tubule necrosis in the
recipient organism. This endpoint is significant because mortality in patients with
acute renal failure is high, and tubular necrosis is associated with many causes such
as ischemia, endotoxemia or exposure to nephrotoxins (Ueda et al., Am. J. Med. 108:
403-415, 2000).

[71] The invention is based, in part, upon the discovery that modulated 

transcriptional regulation of relatively small sets of certain genes in response to a test 
agent can accurately predict the occurrence of kidney toxicity observed at later time 
points. 

[72] Provided herein are multiple sets of kidney toxicity biomarkers which are 

useful in the practice of the kidney toxicity prediction methods of the invention. In 
particular, applicants have identified 376 kidney toxicity biomarkers that demonstrate 
utility in predicting kidney toxicity outcomes. These biomarkers have been thoroughly 
characterized for their predictive performance, individually as well as in various 
combinations or subsets thereof. In addition, various optimized subsets of the kidney 
toxicity biomarkers of the invention are disclosed, which sets have also been 
thoroughly characterized for predictive performance using the methods of the 
invention. Among the subsets of kidney toxicity genes provided herein are several 
which demonstrate prediction accuracies in the vicinity of 95%.

[73] The invention is further described by way of the experimental examples
provided herein. These examples demonstrate that small sets of genes (i.e., in some
instances, as few as 2 or 3 biomarker genes) may be used to accurately predict
kidney toxicity. For example, as further described in the Examples, analysis of
mRNA expression of only a few genes can provide an accurate indication of whether
a test agent will or will not induce kidney toxicity.

[74] The predictive capacity of the methods of the invention has been verified by
(a) comparisons with random classifications, and (b) predictions using data external
to the database used to identify the kidney toxicity biomarkers. Moreover, the
methods of the invention are capable of distinguishing between agent dose levels
which induce toxicity (typically higher doses) and those doses that are non-toxic.
This latter feature is an essential component of meaningful toxicological evaluation.

[75] I. General Techniques: The practice of the present invention will employ,
unless otherwise indicated, conventional techniques of molecular biology (including
recombinant techniques), microbiology, cell biology, biochemistry, nucleic acid
chemistry, and immunology, which are well known to those skilled in the art. Such
techniques are explained fully in the literature, such as Molecular Cloning: A
Laboratory Manual, second edition (Sambrook et al., 1989) and Molecular Cloning: A
Laboratory Manual, third edition (Sambrook and Russell, 2001) (jointly referred to
herein as "Sambrook"); Current Protocols in Molecular Biology (F.M. Ausubel et al.,
eds., 1987, including supplements through 2001); PCR: The Polymerase Chain
Reaction (Mullis et al., eds., 1994); Harlow and Lane (1988) Antibodies, A
Laboratory Manual, Cold Spring Harbor Publications, New York; Harlow and Lane
(1999) Using Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, NY (jointly referred to herein as "Harlow and Lane"); Beaucage
et al., eds., Current Protocols in Nucleic Acid Chemistry (John Wiley & Sons, Inc., New
York, 2000); and Casarett and Doull's Toxicology: The Basic Science of Poisons, C.
Klaassen, ed., 6th edition (2001).

[76] II. Definitions: Unless otherwise defined, all terms of art, notations and other
scientific terminology used herein are intended to have the meanings commonly
understood by those of skill in the art to which this invention pertains. In some cases, 
terms with commonly understood meanings are defined herein for clarity and/or for 
ready reference, and the inclusion of such definitions herein should not necessarily 
be construed to represent a substantial difference over what is generally understood 
in the art. The techniques and procedures described or referenced herein are 
generally well understood and commonly employed using conventional methodology 
by those skilled in the art, such as, for example, the widely utilized molecular cloning 
methodologies described in Sambrook et al., Molecular Cloning: A Laboratory 
Manual 2nd edition (1989) Cold Spring Harbor Laboratory Press, Cold Spring Harbor, 
N.Y. As appropriate, procedures involving the use of commercially available kits and 
reagents are generally carried out in accordance with manufacturer defined protocols 
and/or parameters unless otherwise noted. 

[77] "Toxic" or "toxicity" refers to the result of an agent causing adverse effects,

usually by a xenobiotic agent administered at a sufficiently high dose level to cause 
the adverse effects. 

[78] As used herein, the terms "kidney toxicity biomarker" and "kidney toxicity 

predictive gene" are used interchangeably and refer to a gene whose expression, 
measured at the RNA or protein level, can predict the likelihood of a kidney toxicity
response with accuracy significantly better than would occur by chance. In one 
embodiment, the kidney toxicity response is tubular necrosis. In other embodiments, 
the kidney toxicity response can be other toxicity manifestations that elicit similar 
detectable gene expression changes. These could include other forms of tubular 
injury, glomerular toxicity and papillary injury. 

[79] A "toxicological response" refers to a cellular, tissue, organ or system level
response to exposure to an agent. At the molecular level, this can include, but is not
limited to, the differential expression of genes encompassing both the up- and
down-regulation of expression of such genes at the RNA and/or protein level; the up- or
down-regulation of expression of genes which encode proteins associated with
response to and mitigation of damage, the repair or regulation of cell damage; or
changes in gene expression due to changes in populations of cells in the tissue or
organ affected in response to toxic damage.

[80] An "agent" or "compound" is any element to which an individual can be 

exposed and can include, without limitation, drugs, pharmaceutical compounds, 
household chemicals, industrial chemicals, environmental chemicals, other 
chemicals, and physical elements such as electromagnetic radiation. 

[81] The term "biological sample" as used herein refers to substances obtained 

from an individual. The samples may comprise cells, tissue, parts of tissues, organs, 
parts of organs, or fluids (e.g., blood, urine or serum). Biological samples include, 
but are not limited to, those of eukaryotic, mammalian or human origin. 

[82] "Sample" is defined for the purposes of prediction as a biological sample and 

the gene expression data for that sample. Each sample comes from an individual 
animal. A toxicity classification may also be associated with the sample. 

[83] "Gene expression" as used herein refers to the relative levels of expression 

and/or pattern of expression of a gene. In some embodiments, the expression refers 
to a toxicity gene or toxic response gene. In other embodiments, the expression is of 
a toxicity predictive gene. 

[84] "Gene expression profile" refers to the relative levels of expression of 

multiple different genes measured for the same sample. Gene expression profiles 
may be measured in a sample, such as samples comprising a variety of cell types,
different tissues, different organs, or fluids (e.g., blood, urine, spinal fluid, sweat, 
saliva or serum) by various methods including but not limited to microarray 
technologies and quantitative and semi-quantitative RT-PCR (e.g., Taqman™) 
techniques, as well as techniques for measuring expression of proteins. 

[85] "Individual" refers to a vertebrate, including, but not limited to, a human,
non-human primate, mouse, hamster, guinea pig, rabbit, cattle, sheep, pig, chicken, and
dog.

[86] As used herein, the terms "hybridize", "hybridizing", "hybridizes" and the like,
used in the context of polynucleotides, are meant to refer to conventional
hybridization conditions, such as hybridization in 50% formamide/6X SSC/0.1%
SDS/100 ng/ml ssDNA, in which temperatures for hybridization are above 37
degrees Celsius and temperatures for washing in 0.1X SSC/0.1% SDS are above 55
degrees Celsius, and preferably to stringent hybridization conditions. Whether nucleic
acids will hybridize will depend upon factors such as their degree of complementarity as
well as the stringency of the hybridization reaction conditions. Stringent conditions
can be used to identify nucleic acid duplexes with a high degree of complementarity.
Means for adjusting the stringency of a hybridization reaction are well-known to those
of skill in the art. See, for example, Sambrook et al., "Molecular Cloning: A
Laboratory Manual," Second Edition, Cold Spring Harbor Laboratory Press, 1989;
Ausubel et al., "Current Protocols In Molecular Biology," John Wiley & Sons, 1996
and periodic updates; and Hames et al., "Nucleic Acid Hybridization: A Practical
Approach," IRL Press, Ltd., 1985. In general, conditions that increase stringency
(i.e., select for the formation of more closely-matched duplexes) include higher
temperature, lower ionic strength and presence or absence of solvents; lower
stringency is favored by lower temperature, higher ionic strength, and lower or higher
concentrations of solvents.

[87] In the context of amino acid sequence comparisons, the term "identity" is 

used to express the percentage of amino acid residues at the same relative position 
which are the same. Also in this context, the term "homology" is used to express the 
percentage of amino acid residues at the same relative positions which are either 
identical or are similar, using the conserved amino acid criteria of BLAST analysis, as 
is generally understood in the art. Further details regarding amino acid substitutions, 
which are considered conservative under such criteria, are provided. 
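
The distinction between "identity" and "homology" drawn above can be shown with a small worked example in Python. The sketch assumes two pre-aligned, gap-free sequences of equal length, and it uses a short hypothetical list of conservative residue groups as a stand-in for the conserved amino acid criteria of BLAST analysis; the groups, function name and peptide sequences are assumptions made for the example.

```python
# Hypothetical conservative residue groups; a simplified stand-in for the
# substitution scoring actually applied in BLAST analysis.
CONSERVATIVE_GROUPS = [
    set("ILVM"), set("FYW"), set("KRH"), set("DE"), set("ST"), set("NQ"),
]


def identity_and_homology(seq_a, seq_b):
    """Return (percent identity, percent homology) for two aligned sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal aligned length")
    identical = similar = 0
    for a, b in zip(seq_a, seq_b):
        if a == b:
            identical += 1  # same residue at the same relative position
        elif any(a in group and b in group for group in CONSERVATIVE_GROUPS):
            similar += 1    # different residue, but a conservative substitution
    n = len(seq_a)
    return 100 * identical / n, 100 * (identical + similar) / n


# Hypothetical six-residue peptides: 3 identical and 2 similar positions.
identity, homology = identity_and_homology("MKVLDE", "MRVIDQ")
print(identity, round(homology, 1))
```

Here homology counts identical residues together with conservative substitutions, so it is never lower than identity for the same alignment.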

[88] III. Identification of Kidney Toxicity Biomarkers

A. Generation of Toxicology Gene Expression Biomarkers: The kidney toxicity
biomarkers described herein were initially identified utilizing a database
generated from large numbers of in vivo experiments, wherein the differential
expression of approximately 700 rat genes, measured at various time points, in
response to multiple toxic compounds inducing various specific toxic responses,
as visualized through microscopic histopathological analysis, was quantified, as
described in pending United States Patent Application filed January 29, 2002
(serial number not yet assigned). This quantitative gene expression data, as
well as corresponding histopathological information, was then subjected to an
analytical approach specifically designed to identify genes which not only
correlated with the observed histopathology, but also demonstrated an ability to
be used in a model capable of accurately predicting the occurrence of the toxic
response associated with the observed histopathology. A complete description
of this identification process is presented in the Examples. A flow diagram
illustrating how the kidney toxicity biomarkers of the invention were identified is
presented in Figure 1.

In addition to the database described and utilized herein, other toxicology 
gene expression databases may be generated using techniques well known in the 
art, and used to identify additional kidney toxicity biomarkers, which may also be 
employed in the practice of the kidney toxicity prediction methods of the invention. 
Such databases may be generated with test compounds capable of inducing various 
pathologies indicative of a toxic response in the kidney and/or other organs or 
systems, over different time periods and under different administration and/or dosing 
conditions, including without limitation kidney tubule necrosis, glomerular necrosis, 
glomerular sclerosis and papillary injury. An example of compounds, dose levels, 
kidney toxicity classifications and histopathology scores used in the Examples which 
follow is provided in Table 1.

Such databases may be generated using organisms other than the rat, 
including without limitation, animals of canine, murine, or non-human primate 
species. In addition, such databases may incorporate data derived from human 
clinical trials and post-approval human clinical experiences. Various methods for 
detecting and quantitating the expression of genes and/or proteins in response to 
toxic stimuli may be employed in the generation of such databases, as are generally
known in the art. For example, microarrays comprising multiple cDNAs or
oligonucleotide probes capable of hybridizing to corresponding transcripts of genes
of interest may be used to generate gene expression profiles. Additionally, a number
of other methods for detecting and quantitating the expression of gene transcripts are
known in the art and may be employed, including without limitation, RT-PCR
techniques such as TaqMan®, RNase protection, branched chain technology, etc.

[91] Databases comprising quantitative gene expression information preferably 

include qualitative and quantitative and/or semi-quantitative information respecting 
the observed toxicological responses and other conventional toxicology endpoints, 
such as for example, body and organ weights, serum chemistry and histopathology 
observations, histopathology scores and/or similar parameters. 

[92] B. Identification of Correlating Genes: For the purpose of identifying 

candidate predictive genes, the database preferably includes histopathology scores 
for each animal which has been exposed to one or more agent(s). These scores can 
be assigned based on actual histopathology observations for the tissue and animal or 
on the basis of effects observed for other animals treated with the same agent and 
dose level. The scores are numerical scores that reflect the occurrence and severity 
of histopathological changes. These scores can be adjusted to have similar range to 
gene expression changes. For example, a score of 1 could be assigned to samples
with no changes and scores of 2-8 assigned to increasingly severe changes.
Because the scores are numerical, they are suitable for use with a variety of 
statistical correlation and similarity measures. 

[93] An example of a histopathology scoring system is provided in Example 1.

Referring to Figure 1, histopathology scores may be utilized to identify genes which
correlate with the observed toxicological response, using any number of statistical
correlation and similarity analysis techniques, including without limitation those
techniques described or employed in Example 1 (e.g., Pearson, Spearman, change,
smooth, distance, etc.). Such correlating genes may be used as predictive gene
candidates. Examples of genes whose expression at 24 hours after treatment
correlates with histopathology observed at 72 hours are detailed in Tables 3 and 4. In one
embodiment, the correlating gene lists as well as the entire array gene list are used
as input gene lists in the GeneSpring™ Predictive Model (hereafter the
"Predictive Model").
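By way of illustration only, the correlation step described above may be sketched in Python as follows. The expression values, scores, gene names, the 0.8 cutoff and the helper names (pearson, ranks, spearman) are hypothetical and are not taken from the database or from the GeneSpring™ software. The sketch merely shows how numerical histopathology scores can be correlated with expression changes to nominate candidate genes.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: one histopathology score per animal (1 = no change)
# and the 24 hour expression change of two genes in the same animals.
scores = [1, 1, 2, 4, 6, 8]
expression = {
    "gene_a": [0.0, 0.1, 0.4, 0.9, 1.5, 2.1],
    "gene_b": [0.3, -0.2, 0.1, 0.0, 0.2, -0.1],
}

CUTOFF = 0.8  # illustrative threshold, not taken from the specification
candidates = [name for name, values in expression.items()
              if abs(spearman(values, scores)) >= CUTOFF]
```

In this invented data set only gene_a tracks the histopathology scores closely enough to be retained as a candidate.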

(C) Class Prediction and Classification: Statistical analysis of the database of 
gene expression profiles can be effected by utilizing commercially available software 
programs. In one embodiment, GeneSpring™ (Version 4.1, Silicon Genetics, 
Redwood City, CA) is used. Other software programs which can be used for 
statistical analysis include, without limitation, SAS software packages (SAS Institute 
Inc., Cary, NC) and S-PLUS® software (Insightful Corporation, Seattle, WA).

Using GeneSpring™ software, class predictions can be made from the genes 
in the database, as detailed in Example 1, using one or more training and test sets.
In one embodiment, six training sets and six test sets are obtained, as shown in 
Example 1 (Table 4). Kidney toxicological classifications are entered for the samples 
in each training and test set. Toxicological classifications can be defined by various 
pathologies. In one embodiment, the toxicity is defined as kidney tubular necrosis 
observed 72 hours after treatment with an agent. However, toxicity can manifest in 
other nephropathologies such as glomerular necrosis or papillary injury. 

Once the training sets have been selected, predicted classifications of
the test set samples are obtained by using a k-nearest neighbor (or knn) voting
procedure. The class of each of the k nearest neighbors is determined and the test
sample is assigned to the class with the largest representation among those
neighbors. In one embodiment, the vote is adjusted to account for different
proportions of classes in the training set.
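A minimal sketch of such a voting procedure is given below for illustration. It assumes Euclidean distance between expression profiles, k = 3, and one simple form of adjustment (dividing each class's vote count by that class's frequency in the training set). These choices, the function name knn_predict and the data are assumptions made for the example and do not describe the internal workings of the GeneSpring™ software.

```python
import math
from collections import Counter

def knn_predict(train, test_profile, k=3):
    """Classify test_profile by a k-nearest neighbor vote.

    train is a list of (profile, label) pairs. Each class's vote count is
    divided by the number of training samples of that class, so that a
    rare class is not outvoted merely because it is rare.
    """
    class_freq = Counter(label for _, label in train)
    nearest = sorted(train, key=lambda item: math.dist(item[0], test_profile))[:k]
    votes = Counter(label for _, label in nearest)
    adjusted = {label: votes[label] / class_freq[label] for label in votes}
    return max(adjusted, key=adjusted.get)

# Hypothetical two-gene expression profiles: 2 toxic and 6 nontoxic samples.
train = [
    ((2.0, 1.8), "toxic"), ((1.9, 2.1), "toxic"),
    ((0.1, 0.0), "nontoxic"), ((0.2, -0.1), "nontoxic"),
    ((0.0, 0.2), "nontoxic"), ((-0.1, 0.1), "nontoxic"),
    ((0.9, 0.9), "nontoxic"), ((1.0, 0.8), "nontoxic"),
]

clear_case = knn_predict(train, (0.1, 0.1))   # all three neighbors nontoxic
borderline = knn_predict(train, (1.2, 1.1))   # 2 nontoxic and 1 toxic neighbor
```

For the borderline sample the raw vote is two nontoxic to one toxic, but the adjusted vote (2/6 versus 1/2) favors the toxic class, which shows how the proportion adjustment can change a call.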

Toxicity can also be observed at various time points after exposure to an
agent and is not limited to only 72 hours after treatment. A skilled toxicologist can
determine the optimal time after exposure to an agent to observe pathology either
from what has been disclosed in the art or by stepwise experimentation with time
increments, for example 2, 4, 6, 12, 18, 24, 36, or 48 hours post-exposure, or even
longer time increments, for example, days, weeks, or months after exposure to the
agent.

[98] (D) Identification of Predictive Genes: Figure 1 describes the overall process 

used to identify kidney toxicity predictive genes. In one embodiment, this process 
was run independently for each time point. 

[99] The number of genes that are to be used in the Predictive Model can be 

varied, for example 50, 40, 30, 20, 10, 5, 2, or 1 gene(s) can be used. In a preferred 
embodiment, at least 50 genes are used. 

[100] An optimal gene list is generated that gives the best predictive accuracy

with the lowest number of genes used. Figure 2 shows an exemplary profile for an 
optimal gene list. 

[101] In one embodiment, optimum gene lists for all input gene lists are combined 

for each training and test set and then these combined lists for all six training and test 
sets are merged to create an aggregate list of predictive genes. The aggregate list 
can then be subdivided to smaller lists of genes based on the number of times that 
the genes occurred on the predictive gene lists for each individual training or test set. 
These are designated herein as Combo 6, 5, 4, 3, 2, or 1 lists. The genes that were 
predictive in all 6 training and test sets are designated as Combo 6 and the genes 
that were predictive in 5 of 6 training and test sets are designated as Combo 5 and 
so forth. Table 32 presents gene names, accession numbers and sequence 
information for the kidney toxicity predictive genes found by analysis of the database 
in the manner described above. Each of these genes has been demonstrated to 
contribute to predictive performance for at least one input gene list and training/test 
set and one time point. Table 33 lists the kidney toxicity predictive genes organized 
by time point and Combo Class. Table 34 lists homologous genes for the RCT 
sequences that were identified by BLAST search using the GenBank NR database 
as the target database. 
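The grouping into Combo lists described above amounts to counting, for each gene, the number of training and test sets in which it appeared on a predictive list. The following sketch illustrates this under stated assumptions: the gene identifiers G1 to G4 and the function name combo_lists are hypothetical, and the six input lists stand in for the combined optimum gene lists of the six training and test sets.

```python
from collections import Counter

def combo_lists(per_set_gene_lists):
    """Group genes by the number of training/test sets in which they were predictive.

    Returns a dict mapping the Combo number (6, 5, ..., 1 for six sets)
    to an alphabetically sorted list of genes.
    """
    # set(genes) ensures a gene is counted at most once per training/test set
    counts = Counter(g for genes in per_set_gene_lists for g in set(genes))
    combos = {n: [] for n in range(len(per_set_gene_lists), 0, -1)}
    for gene, n in sorted(counts.items()):
        combos[n].append(gene)
    return combos

# Hypothetical predictive gene lists for six training/test sets.
per_set = [
    ["G1", "G2", "G3"],
    ["G1", "G2"],
    ["G1", "G2", "G4"],
    ["G1", "G2"],
    ["G1", "G2"],
    ["G1"],
]
combos = combo_lists(per_set)
```

Here G1 falls in Combo 6, G2 in Combo 5, and G3 and G4 in Combo 1; the aggregate list is the union of all Combo lists.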
[102] The predictive genes can also be categorized by their occurrence as

predictive at different time points. Table 35 lists 53 genes that are on the combined
predictive lists of all three time points tested. This list is derived from the list of all the 
predictive genes measured at 6, 24 and 72 hours that predicted kidney tubular 
necrosis at 72 hours. Genes that are predictive at multiple time points can be further 
grouped by their Combo ranking. Table 36 lists 23 genes that are the most predictive 
across the three time points tested. This list is a subset of the list of 53 genes that 
are predictive across all three time points 6, 24 and 72 hours. The criteria for 
inclusion in this table were that the gene be a member of the highest combinations, 
viz., combinations 6, 5 or 4 in at least 2 out of three time points. The gene 
expression data of the genes in Table 36 could be expected to be very highly 
predictive of kidney tubular necrosis. Further, since the predictive strength of these 
genes is very high across the 3 time points tested, it could be expected that gene 
expression data derived from these genes even at time points not tested such as any 
time points falling between 6 and 72 hours or any other time point would be very 
highly predictive of tubular necrosis. These specific genes could be useful in cases 
where the dose route or pharmacokinetic properties of a compound may alter the 
kinetics of predictive gene expression changes. 
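The inclusion criteria stated above (predictive at all three time points, and a member of Combo 6, 5 or 4 in at least two of the three time points) can be expressed as a simple filter. The sketch below is illustrative only; the gene identifiers, Combo assignments and the function name most_predictive are hypothetical and do not reproduce Tables 35 or 36.

```python
def most_predictive(combo_by_timepoint, min_combo=4, min_timepoints=2):
    """Select genes predictive at every time point that reach at least
    min_combo in at least min_timepoints of those time points.

    combo_by_timepoint maps a time point to {gene: Combo number}.
    """
    per_time = list(combo_by_timepoint.values())
    # Genes on the predictive lists of all time points (the Table 35 analogue)
    shared = set.intersection(*(set(m) for m in per_time))
    selected = []
    for gene in sorted(shared):
        hits = sum(1 for m in per_time if m[gene] >= min_combo)
        if hits >= min_timepoints:
            selected.append(gene)
    return selected

# Hypothetical Combo numbers per gene and time point.
combo_by_timepoint = {
    "6h":  {"G1": 6, "G2": 2, "G3": 5},
    "24h": {"G1": 5, "G2": 6, "G3": 1, "G4": 6},
    "72h": {"G1": 1, "G2": 3, "G3": 4, "G4": 6},
}
top_genes = most_predictive(combo_by_timepoint)
```

In this example G4 is excluded because it is not predictive at 6 hours, and G2 is excluded because it reaches Combo 4 or higher at only one time point.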

[103] IV. Evaluation of Predictive Genes for Kidney Toxicity: The predictive genes 

are evaluated for predictive performance as shown in Figure 3. For each gene list 
prediction, a table of data was generated using the Predictive Model which included: 
the test set containing information about the actual call (i.e., "yes" or "no" for kidney
toxicity), the predicted call (i.e., "yes" or "no" for kidney toxicity), and the P-value
cutoff ratio. Expression data that can be used with the K-nearest neighbor model
and predictive genes to enable one skilled in the art to make predictions are given in 
Tables 38-40. 

[104] The combined list of predictive genes or alternatively, Combo 6, 5, 4, 3, 2, or 

1 list or subsets thereof was used as input into the Predictive Model. As another 
verification of the predictive abilities of the genes found to be predictive for kidney 
toxicity, random lists of genes were generated and also used as input into the
Predictive Model. Example 2 describes the evaluation of the predictive performance
of the kidney toxicity predictive genes.

WO 03/100030 PCT/US03/06196

[105] Predictive performance may also be assessed using data from different time 

points after exposure to the agent. In one embodiment, 24 hour expression data is 
used. In another embodiment, 6 hour expression data is used, as described in 
Examples 3 and 4. In another embodiment, 72 hour expression data is used, as
described in Examples 5 and 6. As shown in Table 41, predictive capability for 24
hour expression data has a high accuracy rate (i.e., a mean accuracy of about 90%)
when the entire predictive gene list is used.



Table 41 Predictive Performance of Predictive Genes Organized by Occurrence on
Training/Test Set Lists (Combo number) and Time Point

Time Point   Gene Set    Number of Genes   Accuracy**            Geometric Mean**

24 h         Combo All   216               0.915 (0.861-0.945)   0.810 (0.720-0.884)
24 h         Combo 6     28                0.921 (0.867-0.955)   0.837 (0.660-0.953)
24 h         Combo 5     25                0.896 (0.829-0.929)   0.821 (0.684-0.870)
24 h         Combo 4     23                0.882 (0.829-0.929)   0.776 (0.700-0.925)
24 h         Combo 3     19                0.839 (0.778-0.911)   0.740 (0.562-0.892)
24 h         Combo 2     45                0.733 (0.641-0.821)   0.552 (0.343-0.663)
24 h         Combo 1     76                0.787 (0.667-0.884)   0.645 (0.355-0.782)

6 h          Combo All   176               0.719 (0.571-0.793)   0.610 (0.420-0.750)
6 h          Combo 6     15                0.747 (0.567-0.800)   0.542 (0.000-0.800)
6 h          Combo 5     16                0.536 (0.330-0.700)   0.480 (0.400-0.650)
6 h          Combo 4     19                0.731 (0.607-0.875)   0.584 (0.400-0.740)
6 h          Combo 3     21                0.635 (0.330-0.830)   0.514 (0.350-0.630)
6 h          Combo 2     38                0.607 (0.350-0.830)   0.402 (0.000-0.600)
6 h          Combo 1     67                0.588 (0.420-0.820)   0.509 (0.390-0.630)

72 h         Combo All   225               0.882 (0.643-0.974)   0.747 (0.500-0.913)
72 h         Combo 6     16                0.808 (0.607-0.902)   0.601 (0.000-0.869)
72 h         Combo 5     27                0.742 (0.429-0.921)   0.616 (0.452-0.803)
72 h         Combo 4     23                0.828 (0.500-0.917)   0.607 (0.000-0.839)
72 h         Combo 3     33                0.705 (0.357-0.902)   0.414 (0.000-0.649)
72 h         Combo 2     41                0.661 (0.357-0.868)   0.412 (0.000-0.690)
72 h         Combo 1     90                0.783 (0.536-0.941)   0.572 (0.000-0.896)

** Means and ranges are given for 6 training and test sets. Unit of prediction was the
animal and the predictive classification was for kidney tubular necrosis observed
at 72 hours after treatment. Standard prediction measures were used as
defined in Materials and Methods of Example 1. These include:
Accuracy = Proportion of total number of predictions that are correct
Geometric mean = Performance measure that takes into account proportion of
positive and negative cases
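The two performance measures can be computed from the actual and predicted calls as sketched below. The governing definitions are those in Materials and Methods of Example 1; this sketch assumes the commonly used form of the geometric mean, namely the square root of the product of the proportion of positive cases correctly predicted and the proportion of negative cases correctly predicted. The function name performance and the calls shown are hypothetical.

```python
import math

def performance(actual, predicted, positive="yes"):
    """Return (accuracy, geometric mean) for paired actual/predicted calls.

    The geometric mean is taken here as sqrt(sensitivity * specificity),
    which is an assumed definition for this illustration.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return accuracy, math.sqrt(sensitivity * specificity)

# Hypothetical test set: 2 toxic ("yes") and 8 non-toxic ("no") animals.
actual = ["yes", "yes"] + ["no"] * 8
always_no = ["no"] * 10
mixed = ["yes", "no"] + ["no"] * 7 + ["yes"]

acc_trivial, gm_trivial = performance(actual, always_no)
acc_mixed, gm_mixed = performance(actual, mixed)
```

Under this assumed definition, a model that always predicts "no" reaches an accuracy of 0.8 on this unbalanced set but a geometric mean of zero, which illustrates why a measure that accounts for both positive and negative cases is reported alongside accuracy.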

[106] Somewhat lower predictive accuracies were observed for the 6 h and 72 h

data but the prediction was still quite significant. In general, selecting genes from
Combo list 6 for use in prediction of kidney toxicity yields higher average accuracy
than using genes from Combo list 5, which in turn yields higher average accuracy
rates than Combo 4, and so forth for Combo lists 3, 2, and 1. All of the Combo lists as
well as the Combo All list had significantly higher accuracy than using random
classifications.

[107] Predictive performance may also be assessed using subsets of genes from 

the different Combo lists. As indicated in Examples 2, 4 and 6, randomly selected
subsets of the Combo gene lists had very good predictive performance (accuracy 
better than 80% and approaching 90%) and even individual genes had mean 
predictive accuracies that were significant (for example, greater than 80%). 
Cumulative performance of subsets of 24 h data is presented in Figures 4-6. In one 
embodiment, using 3 genes from Combo list 6 yields about 90% accuracy. However, 
using different Combo lists may require more genes to reach the same accuracy 
level, e.g., 8 genes from Combo 5 list, 13 genes from Combo 4 list. 

[108] V. Use of kidney toxicity predictive genes: The kidney toxicity predictive 

genes disclosed herein and kidney toxicity predictive genes identified by using 
methods disclosed herein are useful for predicting kidney toxicity in response to 
exposure to one or more agents. 

[109] The discovery that relatively small sets of different genes have predictive

value permits flexible application of these discoveries. The choice of how many and 
which genes to use can be tailored to a variety of different purposes. Very good 
predictivity is observed for sets of a few genes (for example as few as three genes of 
the 24 hour Combo 6 set have mean prediction accuracy of about 90%). These 
small sets may be particularly advantageous in applications where measurement of 
only a few RNA species has considerable advantages in terms of sample processing
logistics, speed and cost. These applications would include relatively high 
throughput screens for predictive capability. An example of this would be an early 
screen using small samples of primary cells or cultured cell lines that can be 
processed with automated robotic equipment for treatment and isolation of RNA 
followed by efficient technologies for measuring expression of a few RNA species 
such as branched chain technology or RT-PCR. The use of larger numbers of 
predictive genes provides for redundancy and consequent greater accuracy and 
precision. Applications using larger numbers of predictive genes might be tests of 
candidates at later stages of commercial development. An example would be later 
stages of preclinical development of a therapeutic candidate where in vivo samples 
can be obtained and more comprehensive methods such as microarray 
measurement of gene expression are appropriate. The larger gene sets can also 
include different subsets of genes which may offer more insight into potential 
mechanisms of toxicity and the ability to have refined predictions of long term toxic 
consequences such as chronic, irreversible toxicity or carcinogenicity. 

[110] Some members of the kidney toxicity predictive genes may also be suitable

for prediction of toxicity in other organs or may be preferable for predicting toxicity for 
wider ranges of timepoints or treatment routes or regimens. As an example of the 
latter, some of the predictive genes are observed at three different timepoints after 
treatment. These genes may be useful for prediction in cases where the samples 
come from treatment protocols that have different measurement timepoints or routes 
of administration than those employed for the database or where the toxicokinetics 
for a particular agent are known or suspected to be different from those in the 
database. 

[111] In one embodiment, the agent is an agent for which no expression profile has 

been assessed or stored in the database or library. An animal, e.g., rat, is dosed with 
such an agent and the gene expression profile(s) is the test set for the Predictive 
Model. The training set which is used in the Predictive Model in this case can be the 
entire database of sample array data because the test set data is not present in the
database. As described in Example 8, the prediction can be made with accuracy
without requiring the use of histopathology scores for the test set as part of the input
into the Predictive Model.

[112] In another embodiment the agent is an agent present in the database but is

used at a different dose level or with a different treatment protocol than used in the
database. The training set which is used in the Predictive Model in this case can be
the entire database of sample array data because the test set data is not present in
the database. As described in Example 8, the prediction can be made with accuracy
without requiring the use of histopathology scores for the test set as part of the input
into the Predictive Model.

[113] In another embodiment, the exposure time of the agent is not 6, 24, or 72

hours or repeat dosing protocols are used. In this case, the skilled artisan can use
the toxicity predictive genes from surrounding time points to extrapolate the predicted
toxicity without undue experimentation. For example, if the individual has been
exposed to the agent for 12 hours, then predictive genes from the 6 and 24 hour
time points are used as guidelines for extrapolating possible predicted toxicity.

[114] In another embodiment, the kidney predictive genes and predictive model

can be used to determine the presence or absence of a no-observable toxicity effect
level (NOEL). An agent can be used at different treatment levels and expression
profiles obtained for each treatment level. The predictive genes and predictive model
can be used to determine which dose levels elicit a response that is predicted to be
toxic and which dose levels are not toxic. In contrast to conventional endpoints for
determining no-effect levels, the use of expression data, predictive genes and
predictive models applies a number of quantitative endpoints and criteria instead of
subjective endpoints and criteria. This permits more rigorous and precisely defined
determination of no-effect levels.

[115] In another embodiment, the kidney toxicity predictive genes can be used to

detect toxic effects that may be manifested as long lasting or chronic consequences
such as irreversible toxicity or carcinogenesis. The predictive genes and model can
be applied to databases where classifications of training and test set samples are
made with respect to actual or putative endpoints such as irreversible toxicity or
carcinogenicity.



[116] In another embodiment, the predictive genes can be used in a variety of

alternative models to predict kidney toxicity. Some of these models do not require
the direct use of data in a database but use functions or coefficients derived from the
database. In another embodiment, the predictive genes and models may be used to
evaluate in vitro systems for their ability to reflect in vivo toxic events and to use such
in vitro systems for predicting in vivo toxicity. Expression profiles for predictive genes
can be created from candidate in vitro assays using treatments with agents of known
in vivo toxicity and for which in vivo data on gene expression are available. The
expression data and predictive models of this invention can be used to determine
whether the in vitro assay system has predictive gene expression responses that
accurately reflect the in vivo situation. Large sets of predictive genes as described in
this invention can be tested in such models for their suitability and performance with
the candidate in vitro systems. This is a superior and novel tool for evaluating and
optimizing in vitro systems for their ability to reflect and accurately predict in vivo
responses.

[117] In another embodiment, measurement of the expression levels of the

proteins coded for by the predictive genes can be used in conjunction with predictive
models to predict kidney toxicity. Among the full set of kidney toxicity predictive
genes are various genes known to encode cell surface, secreted and/or shed
proteins. This enables the development of methods for predicting toxicity using
protein biomarkers. Example 11 presents a process by which candidate protein
biomarker genes may be selected from biomarker genes identified from transcription
expression. For example, as disclosed in Table 37, there are 23 genes in the master
predictive set which are known to encode secreted proteins. As disclosed in Table
43, predictive protein marker candidates may also be selected by categorizing a
number of other parameters related to the predictive performance and potential use
as protein markers. In Example 11, the utility of this concept has been demonstrated
by testing for serum protein levels of one of the identified biomarkers, insulin-like
growth factor binding protein 1. The serum protein levels of this biomarker parallel
the kidney transcription levels and distinguish kidney toxic from non-toxic treatments. 
Thus, in another aspect of the present invention, kidney toxicity predictive assays 
which detect the expression of one or more of said predictive proteins may be 
developed. Such assays may have several advantages, such as: 

(1) Ability to use archived tissue specimens such as preserved or embedded 
tissues that are not suitable for measurement of RNA expression 

(2) Ability to examine predictive protein expression in tissue slides using in
situ labeling and microscopic observation. This is useful for detecting
toxicity predictive signals occurring in very small subpopulations of cells.

(3) Ability to detect protein markers in specimens that can be readily obtained
with little or no invasiveness (e.g., blood, urine, sweat, saliva).

(4) Reduction in animal use in laboratory studies, such that no sacrifice of
animals is necessary to obtain tissue specimens when toxicity prediction can
be made with specimens that can be obtained without animal sacrifice or
surgery.

(5) Application for human use where tissue specimens cannot be obtained or 
are only obtained with great difficulty. 

[118] In another embodiment, the identified predictive genes can be considered as 

potential therapeutic targets when the genes are involved in toxic damage or repair 
responses whose expression or functional modification may attenuate, ameliorate or 
eliminate disease conditions or adverse symptoms of disease conditions. 

[119] In another embodiment, the predictive genes can be organized into clusters 

of genes that exhibit similar patterns of expression by a variety of statistical 
procedures commonly used to identify such coordinate expression patterns.
Common functional properties of these clustered genes can be used to provide
insight into the functional relationship of the response of these genes to toxic effects. 
Common genetic properties of these genes (e.g., common regulatory sequences) 
may provide insight into functional aspects by revealing known or novel similarities in 
the coding region of the genes. The presence of common known or novel signal 
transduction systems that regulate expression of the genes can also lead to insight 
as to the functional properties of the genes. The presence of common known or 
novel regulatory sequences in the identified predictive genes can also be used to 
identify toxicity predictive genes that are not present in the current Rat CT array. 
This can be accomplished by someone skilled in the art who can analyze sequence 
databases for common regulatory sequences. 
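As one hedged illustration of such a clustering procedure, the sketch below groups genes whose expression profiles are highly correlated, using single-linkage grouping at a fixed Pearson correlation threshold. The threshold of 0.9, the gene identifiers, the profiles and the function names are assumptions made for the example; any of the statistical procedures commonly used for this purpose could be substituted.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_clusters(profiles, threshold=0.9):
    """Single-linkage grouping: two genes share a cluster when they are
    linked by a chain of pairwise correlations at or above threshold."""
    genes = sorted(profiles)
    parent = {g: g for g in genes}

    def find(g):
        # Follow parent links to the cluster representative (with path halving)
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            if pearson(profiles[a], profiles[b]) >= threshold:
                parent[find(b)] = find(a)

    clusters = {}
    for g in genes:
        clusters.setdefault(find(g), []).append(g)
    return sorted(clusters.values())

# Hypothetical expression profiles across four samples.
profiles = {
    "G1": [0.0, 1.0, 2.0, 3.0],
    "G2": [0.1, 1.1, 1.9, 3.2],
    "G3": [3.0, 2.0, 1.0, 0.0],
    "G4": [2.9, 2.2, 0.8, 0.1],
}
clusters = correlation_clusters(profiles)
```

With these invented profiles the induced genes (G1, G2) and the repressed genes (G3, G4) form two separate clusters.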

[120] In yet another embodiment, the kidney toxicity predictive genes can be used

to predict toxicity responses in other species, for example, human, non-human 
primate, mouse, hamster, guinea pig, rabbit, cattle, sheep, pig, chicken, and dog. 
Some members of the kidney toxicity predictive genes may also be more suitable for 
prediction of toxicity in species other than the species used to derive the database 
(rat in the case of the examples provided). One method for identification of such
genes that would be available to one skilled in the art is to examine
DNA sequence databases to determine whether orthologous sequences to the 
predictive genes exist in the target species and how close the orthologous 
sequences are to the predictive gene sequences. One of skill in the art can examine 
the orthologous sequences for similarity in amino acid coding regions and motifs as 
well as for similarities in regulatory regions and motifs of the gene. 

[121] In another embodiment, kidney toxicity predictive genes or gene sequences 

are used for screening other potential toxicity predictive genes or gene sequences in 
other species or even within the same species using methods known in the art. See,
for example, Sambrook, supra. Gene sequences which hybridize under stringent
conditions to the kidney toxicity predictive gene sequences disclosed herein are
selected as potential toxicity predictive genes. Gene sequences which hybridize to
the kidney toxicity predictive genes of this invention can show homology to the kidney
toxicity predictive genes, and are preferably at least about 50%, 60%, 70%, 80%, or 90%
identical to the kidney toxicity predictive genes disclosed herein. It is understood
that conservative substitutions of amino acids are possible for gene sequences which
have some percentage homology with the kidney toxicity predictive gene sequences
of this invention. A conservative substitution in a protein is a substitution of one
amino acid with an amino acid with similar size and charge. Groups of amino acids
known normally to be equivalent are: (a) Ala, Ser, Thr, Pro, and Gly; (b) Asn, Asp,
Glu, and Gln; (c) His, Arg, and Lys; (d) Met, Leu, Ile, and Val; and (e) Phe, Tyr, and
Trp.
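The distinction between percent identity and percent homology can be illustrated with the simplified sketch below. It assumes two pre-aligned, gap-free sequences of equal length and treats a substitution as conservative when both residues fall within one of the equivalence groups listed above, with group (d) taken as Met, Leu, Ile and Val. This grouping is a stand-in for illustration and is not the scoring matrix used by BLAST; the sequences and the function name identity_and_homology are hypothetical.

```python
# One-letter codes for the equivalence groups (a) through (e).
GROUPS = ["ASTPG", "NDEQ", "HRK", "MLIV", "FYW"]

def identity_and_homology(seq1, seq2):
    """Return (percent identity, percent homology) for two aligned,
    equal-length, gap-free amino acid sequences.

    Homology counts positions that are identical or that differ by a
    substitution within one of GROUPS.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    group_of = {aa: i for i, group in enumerate(GROUPS) for aa in group}
    identical = sum(a == b for a, b in zip(seq1, seq2))
    similar = sum(
        a == b or (a in group_of and group_of[a] == group_of.get(b))
        for a, b in zip(seq1, seq2)
    )
    n = len(seq1)
    return 100.0 * identical / n, 100.0 * similar / n

# Hypothetical aligned peptide fragments.
identity, homology = identity_and_homology("MKTAYLW", "MRSAFLC")
```

In this example three of seven positions are identical (about 42.9% identity) and three more differ only by substitutions within a group (K/R, T/S, Y/F), giving about 85.7% homology.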

[122] It is also understood that the toxicity predictive genes can be used as guides 

to predicting toxicity for agents that have been administered via different routes (e.g.,
intravenous, oral, dermal, inhalation, etc.) from the routes that were used to
generate the database or to identify the toxicity predictive genes. Furthermore, the
invention is not intended to be limited to agents that have been administered at
the same dosages as the agents that were used to generate the database or to
identify the toxicity predictive genes.

[123] Data described in the examples were generated using the microarray 

technology disclosed in the Examples. However, the invention is not dependent on 
using this particular platform. Other similar gene expression analysis technologies 
may be incorporated in the practice of this invention. These can include, but are not 
limited to, other arrays containing the predictive genes, RT-PCR (e.g., TaqMan®), 
branched chain technology, RNase protection or any other method which
quantitatively detects the expression of RNA polynucleotides. The invention can be
practiced using these other technologies by generating a database of expression
measurements for the predictive genes using samples such as those used in the
database described in Example 1. This database can then be used in a model such
as the K-nearest neighbor model or can be used to develop any of a number of other 
models. 

[124] The following Examples are provided to illustrate but not to limit the invention 
in any manner.

EXAMPLES

[125] Example 1 Discovery of Kidney Toxicity Predictive Genes from 24 Hour 

Expression Data. Materials and Methods: (A) Database of Compounds and Kidney
Toxicity: The compounds and treatments used to construct the kidney database are
given in Table 1. This table also provides the evaluation of the kidney toxicity
observed as kidney tubular necrosis in samples collected 72 hours after treatment. 

[126] (B) Database of Animal Experiments: Sprague Dawley rats Crl:CD from

Charles River, Raleigh, NC were divided into treated rats that received a specific
concentration of the compound (see Table 1) and control rats that received only
the vehicle in which the compound was mixed (e.g., saline).

[127] At specified timepoints (6h, 24h and 72h) after administration (intraperitoneal

route) of the compound, a set number of rats (usually 3 control and 3 treated) were
euthanized and tissues collected. Each rat was heavily sedated with an overdose of
CO2 by inhalation and a maximum amount of blood drawn. Exsanguination by this
drawing of blood kills the rat. The method of collecting the tissues is very
important and ensures preserving the quality of the mRNA in the tissues. The body
of the rat was then opened up and prosectors rapidly removed the tissues (including 
kidney) and immediately placed them into liquid nitrogen. All of the organs/tissues 
were completely frozen within 3 minutes of the death of the animal to ensure that 
mRNA did not degrade. The organs/tissues were then packaged into well-labeled 
plastic freezer quality bags and stored at -80 degrees until needed for isolation of the 
mRNA from a portion of the organ/tissue sample. 

[128] (C) Isolating DNA/RNA from animal tissues or cells: Total RNA was isolated 

from kidney tissue samples using the following materials: Qiagen RNeasy midi kits, 
2-mercaptoethanol, liquid N2, tissue homogenizer, dry ice. Samples were kept on ice
when specified.
[129] If a tissue needed to be broken, then the tissue sample was placed on a

double layer of aluminum foil which was then placed within a weigh boat containing a 
small amount of liquid nitrogen. The aluminum foil was folded around the tissue and 
then struck by a small foil-wrapped hammer to administer mechanical stress forces. 

[130] About 0.15-0.20 g of kidney tissue was weighed out and placed in a sterile 

container. To preserve integrity of the RNA, all tissues were kept on dry ice when 
other samples were being weighed. RLT buffer (Qiagen®) was added to the
sample to aid in the homogenization process. The tissue was homogenized using a
commercially available homogenizer (IKA Ultra Turrax T25 homogenizer) with the 7
mm microfine sawtooth shaft and generator (195 mm long with a processing range of 
0.25 ml to 20 ml, item # 372718). After homogenization, samples were stored on ice 
until all samples were homogenized. The homogenized tissue sample was spun to 
remove nuclei thus reducing DNA contamination. The supernatant of the lysate was 
then transferred to a clean container containing an equal volume of 70% EtOH in 
DEPC-treated H2O and mixed. RNA was isolated by putting the supernatant through
an RNeasy spin column, washed, and subsequently eluted. Small quantities of 
remaining DNA were removed by use of DNase enzyme during the RNA isolation 
procedure following the instructions provided by Qiagen and alternatively by lithium 
chloride (LiCl) precipitation following the RNA isolation. The isolated RNA pellet was
stored in RNase-free water or in an RNA storage buffer (10 mM sodium citrate),
Ambion Cat #7000. The RNA amount was then quantitated using a 
spectrophotometer. 

[131] (D) Rat 700 CT chip: Gene expression data was generated from a microarray

chip that has a set of toxicologically relevant rat genes which are used to predict 
toxicological responses. The rat 700 CT gene array is disclosed in U.S. applications 
60/264,933; 60/308,161; and pending application filed on January 29, 2002 that 
claims priority to 60/264,933 and 60/308,161 [Attorney docket 40074-2000600]. 

[132] (E) Microarray RT reaction: Fluorescence-labeled first strand cDNA probe 

was made from the total RNA or mRNA isolated from kidneys of control and treated 
rats. This probe was hybridized to microarray slides spotted with DNA specific for
toxicologically relevant genes. The materials needed are: total or messenger RNA, 
primer, Superscript II buffer, dithiothreitol (DTT), nucleotide mix, Cy3 or Cy5 dye, 
Superscript II (RT), ammonium acetate, 70% EtOH, PCR machine, and ice. 



[133] The volume of each sample that would contain 20 µg of total RNA (or 2 µg of
mRNA) was calculated. The amount of DEPC water needed to bring the total volume
of each RNA sample to 14 µl was also calculated. If RNA was too dilute, the
samples were concentrated to a volume of less than 14 µl in a speedvac without
heat. The speedvac must be capable of generating a vacuum of 0 Milli-Torr so that
samples can freeze dry under these conditions. Sufficient volume of DEPC water
was added to bring the total volume of each RNA sample to 14 µl. Each PCR tube
was labeled with the name of the sample or control reaction. The appropriate volume
of DEPC water and 8 µl of anchored oligo dT mix (stored at -20°C) was added to
each tube.

[134] Then the appropriate volume of each RNA sample was added to the labeled
PCR tube. The samples were mixed by pipetting. The tubes were kept on ice until all
samples were ready for the next step. It is preferable for the tubes to be kept on ice until
the next step is ready to proceed. The samples were incubated in a PCR machine
for 10 minutes at 70°C followed by a 4°C incubation period until the sample tubes were
ready to be retrieved. The sample tubes were left at 4°C for at least 2 minutes.

[135] The Cy dyes are light sensitive, so any solutions or samples containing
Cy-dyes should be kept out of light as much as possible (e.g., cover with foil) after this
point in the process. Sufficient amounts of Cy3 and Cy5 reverse transcription mix
were prepared for one to two more reactions than would actually be run by scaling up
the following:

[136] For labeling with Cy3

8 µl 5x First Strand Buffer for Superscript II
4 µl 0.1 M DTT
2 µl Nucleotide Mix
2 µl of 1:8 dilution of Cy3 (e.g., 0.125 mM Cy3-dCTP)
2 µl Superscript II

[137] For labeling with Cy5

8 µl 5x First Strand Buffer for Superscript II
4 µl 0.1 M DTT
2 µl Nucleotide Mix
2 µl of 1:10 dilution of Cy5 (e.g., 0.1 mM Cy5-dCTP)
2 µl Superscript II
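
The volume arithmetic in the preceding steps can be sketched as follows. This is only an illustrative sketch: the function names, the concentration unit (µg/µl) and the default of two spare reactions are assumptions made for the example, while the per-reaction volumes are taken from the labeling recipe above.

```python
def rna_and_water_volumes(conc_ug_per_ul, target_ug=20.0, total_ul=14.0):
    """Return (RNA volume, DEPC water volume) in µl for one sample."""
    rna_ul = target_ug / conc_ug_per_ul
    if rna_ul > total_ul:
        # Too dilute: the sample must first be concentrated (speedvac step).
        raise ValueError("sample too dilute; concentrate before use")
    return rna_ul, total_ul - rna_ul


# Per-reaction labeling mix volumes in µl, as listed in the recipe.
LABELING_MIX_UL = {
    "5x First Strand Buffer": 8.0,
    "0.1 M DTT": 4.0,
    "Nucleotide Mix": 2.0,
    "diluted Cy dye": 2.0,
    "Superscript II": 2.0,
}


def scale_mix(n_reactions, extra=2):
    """Scale the per-reaction mix for n_reactions plus `extra` spare reactions."""
    factor = n_reactions + extra
    return {name: vol * factor for name, vol in LABELING_MIX_UL.items()}
```

For a sample at a hypothetical 2 µg/µl, this gives 10 µl of RNA and 4 µl of DEPC water; the five mix components sum to 18 µl per reaction.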

[138] About 18 µl of the pink Cy3 mix was added to each treated sample and 18 µl
of the blue Cy5 mix was added to each control sample. Each sample was mixed by
pipetting. The samples were placed in a DNA engine (PTC-200 Peltier Thermal
Cycler, MJ Research) for 2 hours at 45°C followed by 4°C until the sample tubes
were ready to be retrieved.

[139] In addition to the desired cDNA product, the completed RT reaction 

contained impurities that must be removed. These impurities included excess 
primers, nucleotides, and dyes. The primary method of removing the impurities was 
by following the instructions in the QIAquick PCR purification kit (Qiagen
cat#120016). 

[140] Alternatively, the completed RT reactions were cleaned of impurities by 

ethanol precipitation and resin bead binding. The samples from the DNA engine were
transferred to Eppendorf tubes containing 600 µl of ethanol precipitation mixture and
placed in a -80°C freezer for at least 20-30 minutes. These samples were centrifuged
for 15 minutes at 20800 x g (14000 rpm in Eppendorf model 5417C) and the
supernatant was carefully decanted. A visible pellet was seen (pink/red for Cy3, blue for Cy5).
Ice cold 70% EtOH (about 1 ml per tube) was used to wash the tubes and the tubes
were subsequently inverted to clean tube and pellet. The tubes were centrifuged for
10 minutes at 20800 x g (14000 rpm in Eppendorf model 5417C), then the
supernatant was carefully decanted. The tubes were air dried for about 5 to 10
minutes, protected from light. When the pellets were dried, they were resuspended
in 80 µl nanopure water. The cDNA/mRNA hybrid was denatured by heating for 5
minutes at 95°C in a heat block and flash spun. Then the lid of a "Millipore MAHV
N45" 96 well plate was labeled with the appropriate sample numbers. A blue gasket
and waste plate (v-bottom 96 well) was attached. About 160 µl of Wizard DNA
Binding Resin (Promega cat#A1151) was added to each well of the filter plate that
was used. Probes were added to the appropriate wells (80 µl cDNA samples)
containing the Binding Resin. The reaction was mixed by pipetting up and down ~10
times. The plates were centrifuged at 2500 rpm for 5 minutes (Beckman GS-6 or
equivalent) and then the filtrate was decanted. About 200 µl of 80% isopropanol was
added, the plates were spun for 5 minutes at 2500 rpm, and the filtrate was
discarded. Then the 80% isopropanol wash and spin step was repeated. The filter
plate was placed on a clean collection plate (v-bottom 96 well) and 80 µl of Nanopure
water, pH 8.0-8.5 was added. The pH was adjusted with NaOH. The filter plate was
secured to the collection plate and after 5 minutes was centrifuged for 7 minutes at
2500 rpm.

[141] (F) Purification of Cy-Dye Labeled cDNA: To purify fluorescence-labeled

first strand cDNA probes, the following materials were used: Millipore MAHV N45 96 
well plate, v-bottom 96 well plate (Costar), Wizard DNA binding Resin, wide orifice 
pipette tips for 200 to 300 µl volumes, isopropanol, and nanopure water. It is highly
preferable to keep the plates aligned at all times during centrifugation. Misaligned 
plates lead to sample cross contamination and/or sample loss. It is also important 
that plate carriers are seated properly in the centrifuge rotor. 

[142] The lid of a "Millipore MAHV N45" 96 well plate was labeled with the 

appropriate sample numbers. A blue gasket and waste plate (v-bottom 96 well) was 
attached. Wizard DNA Binding Resin (Promega cat#A1151) was shaken 
immediately prior to use for thorough resuspension. About 160 µl of Wizard DNA
Binding Resin was added to each well of the filter plate that was used. If this was
done with a multi-channel pipette, wide orifice pipette tips were used to
prevent clogging. It is highly preferable not to touch or puncture the membrane of the
filter plate with a pipette tip. Probes were added to the appropriate wells (80 µl cDNA
samples) containing the Binding Resin. The reaction was mixed by pipetting up and
down ~10 times. It is preferable to use regular, unfiltered pipette tips for this step.
The plates were centrifuged at 2500 rpm for 5 minutes (Beckman GS-6 or equivalent)
and then the filtrate was decanted. About 200 µl of 80% isopropanol was added, the
plates were spun for 5 minutes at 2500 rpm, and the filtrate was discarded. Then the
80% isopropanol wash and spin step was repeated. The filter plate was placed on a
clean collection plate (v-bottom 96 well) and 80 µl of Nanopure water, pH 8.0-8.5 was
added. The pH was adjusted with NaOH. The filter plate was secured to the 
collection plate with tape to ensure that the plate did not slide during the final spin. 
The plate sat for 5 minutes and was centrifuged for 7 minutes at 2500 rpm. 
Replicates of samples should be pooled. 

[143] (G) Dry-down Process: Concentration of the cDNA probes is preferable so 

that they can be resuspended in hybridization buffer at the appropriate volume. The 
volume of the control cDNA (Cy-5) was measured and divided by the number of 
samples to determine the appropriate amount to add to each test cDNA (Cy-3). 
Eppendorf tubes were labeled for each test sample and the appropriate amount of 
control cDNA was allocated into each tube. The test samples (Cy-3) were added to 
the appropriate tubes. These tubes were placed in a speed-vac to dry down, with foil 
covering any windows on the speed vac. At this point, heat (45°C) may be used to 
expedite the drying process. Samples may be saved in dried form at -20°C for up to 
14 days. 

[144] (H) Microarray Hybridization: To hybridize labeled cDNA probes to single 

stranded, covalently bound DNA target genes on glass slide microarrays, the 
following materials were used: formamide, SSC, SDS, 0.2 µm syringe filter, salmon
sperm DNA (Sigma, cat # D-7656), human Cot-1 DNA (Life Technologies, cat #
15279-011), poly A (40 mer: Life Technologies, custom synthesized), yeast tRNA
(Life Technologies, cat # 15401-04), hybridization chambers, incubator, coverslips, 
parafilm, heat blocks. It is preferable that the array is completely covered to ensure 
proper hybridization. 

[145] About 30 µl of hybridization buffer was prepared per cDNA sample (control
rat cDNA plus treated rat cDNA). Slightly more than what is needed should be
made since about 100 µl of the total volume made for all hybridizations can be lost
during filtration.

Hybridization Buffer: for 100 µl

• 50% Formamide    50 µl formamide
• 5X SSC           25 µl 20X SSC
• 0.1% SDS         25 µl 0.4% SDS



[146] The solution was filtered through a 0.2 µm syringe filter, then the volume was
measured. About 1 µl of salmon sperm DNA (10 mg/ml) was added per 100 µl of
buffer.

[147] Alternatively, the hybridization buffer was made up as: 

Hybridization Buffer: for 101 µl

• 50% Formamide    50 µl formamide
• 10X SSC          50 µl 20X SSC
• 0.2% SDS         1 µl 20% SDS

[148] The solution was filtered through a 0.2 µm syringe filter, then the volume was
measured. One microliter of salmon sperm DNA (9.7 mg/ml), 0.5 µl Human Cot-1
DNA (5 µg/µl), 0.5 µl poly A (5 µg/µl), and 0.25 µl Yeast tRNA (10 µg/µl) were added per
100 µl of buffer. The hybridization buffers were compared in validation studies and
there was no change in differential gene expression data between the two buffers. 
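
The nominal final concentrations in the two buffer recipes follow from the usual dilution rule (C1 × V1 = C2 × V2). A minimal sketch of that check is shown below; the function and variable names are illustrative only, and the stock concentrations and volumes are those listed in the recipes.

```python
def final_concentration(stock_conc, stock_ul, total_ul):
    """Dilution rule C1*V1 = C2*V2, solved for the final concentration C2."""
    return stock_conc * stock_ul / total_ul


# First recipe: 100 µl total volume.
ssc_a = final_concentration(20.0, 25.0, 100.0)  # 20X SSC stock, result in X
sds_a = final_concentration(0.4, 25.0, 100.0)   # 0.4% SDS stock, result in %

# Alternative recipe: 101 µl total volume.
ssc_b = final_concentration(20.0, 50.0, 101.0)  # 20X SSC stock, result in X
sds_b = final_concentration(20.0, 1.0, 101.0)   # 20% SDS stock, result in %
```

With a 101 µl total, the alternative recipe comes out slightly below the nominal 10X SSC and 0.2% SDS, which is consistent with the rounded values given in the recipe.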

[149] Materials used for hybridization were: 2 Eppendorf tube racks, hybridization 

chambers (2 arrays per chamber), slides, coverslips, and parafilm. About 30 µl of
nanopure water was added to each hybridization chamber. Slides and coverslips
were cleaned using N2 stream. About 30 µl of hybridization buffer was added to
dried probe and vortexed gently for 5 seconds. The probe remained in the dark for
10-15 minutes at room temperature and then was gently vortexed for several
seconds and then was flash spun in the microfuge. The probes were boiled or
placed in a 95 °C heat block for 5 minutes and centrifuged for 3 min at 20800 x g
(14000 rpm, Eppendorf model 5417C). Probes were placed in 70 °C heat block.
Each probe remained in this heat block until it was ready for hybridization. 

[150] About 25 µl was pipetted onto a coverslip. It is highly preferable to avoid the
material at the bottom of the tube and to avoid generating air bubbles. This may
mean leaving about 1 µl remaining in the pipette tip. The slide was gently lowered,
face side down, onto the sample so that the coverslip covered that portion of the slide 
containing the array. Slides were placed in a hybridization chamber (2 per chamber). 
The lid of the chamber was wrapped with parafilm and the slides were placed in a 
42°C humidity chamber in a 42°C incubator. It is preferable to not let probes or 
slides sit at room temperature for long periods. The slides were incubated for 18-24 
hours. 

[151] (I) Post-Hybridization Washing: To obtain only single stranded cDNA probes 

tightly bound to the sense strand of target cDNA on the array, all non-specifically 
bound cDNA probe should be removed from the array. Removal of all non- 
specifically bound cDNA probe was accomplished by washing the array and using 
the following materials: slide holder, glass washing dish, SSC, SDS, and nanopure 
water. Six glass buffer chambers and glass slide holders were set up with 2X SSC
buffer heated to 30-34°C and used to fill up glass dish to 3/4th of volume or enough
to submerge the microarrays. The slides were placed in 2X SSC buffer for 2 to 4
minutes while the cover slips fell off. The slides were then moved to 2X SSC, 0.1%
SDS and soaked for 5 minutes. The slides were transferred into 0.1X SSC and 0.1%
SDS for 5 minutes. Then the slides were transferred to 0.1X SSC for 5 minutes. The
slides, still in the slide carrier, were transferred into nanopure water (18 megaohms) 
for 1 second. To dry the slides, the stainless steel slide carriers were placed on 
micro-carrier plates and spun in a centrifuge (Beckman GS-6 or equivalent) for 5 
minutes at 1000 rpm. 

[152] (J) Scanning slides: The washed and dried hybridized slides were scanned 

on an Axon Instruments Inc. GenePix 4000A MicroArray Scanner and the fluorescent
readings from this scanner were converted into quantitation files (.gpr) on a computer
using GenePix software. 

[153] II. Array Data, Normalization and Transformation: GeneSpring™ software 
(Version 4.1, Silicon Genetics) was used for statistical analyses including
identification of gene expressions correlating with histopathology scores, K-means
and tree cluster analysis, and predictive modeling using the K-means nearest 
neighbor (Predict Parameter Values tool). 

[154] Microarray data were loaded into GeneSpring™ software for analysis as 

GenePix files as above. Initially, for the set A training set compounds (see Table 4),
data from one microarray were used per animal. Next, for the set A test set compounds
(see Table 4), replicate arrays for each animal were combined into one GenePix file.
Specific data loaded into GeneSpring™ software included gene name, GenBank ID,
control channel mean fluorescence and signal channel mean fluorescence.
Expression ratio data (ratio of signal to control fluorescence) were normalized using
the 50th percentile of the distribution of all genes and control channel. Ratio data
were excluded from analysis if the control channel value was <0. For analysis of
correlations and predictive values, gene expression ratios were transformed as the
log of the ratio.
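
One possible reading of the normalization and transformation just described is sketched below. It is not the GeneSpring implementation: the function name is hypothetical, the median of the per-gene ratios is used as the 50th percentile, the natural logarithm is assumed because the base is not stated, and genes with a non-positive control value are excluded so that the ratio is defined.

```python
import math
from statistics import median


def normalize_log_ratios(signal, control):
    """Return per-gene log ratios, normalized to the median ratio.

    signal, control: dicts mapping gene name -> mean fluorescence
    (signal values are assumed to be positive).
    Genes whose control value is not positive are excluded (value None).
    """
    ratios = {g: signal[g] / control[g] for g in signal if control[g] > 0}
    center = median(ratios.values())  # 50th percentile of all usable ratios
    return {
        g: (math.log(ratios[g] / center) if g in ratios else None)
        for g in signal
    }
```

After this step a gene expressed at the median ratio has a value of 0, and up- and down-regulation of the same fold size are symmetric around 0.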

[155] Correlation with Histopathology Scores: Histopathology scores for each 

animal (assigned on a compound-dose basis as indicated in Table 1) were entered
with gene expression data by using the GeneSpring™ 'Drawn Gene' function.
Correlations between the histopathology scores and gene expression were
conducted with the distance measures listed below:

standard positive and negative correlation
smooth positive and negative correlation
change positive correlation
upregulated positive correlation
Pearson positive and negative correlation
Spearman positive and negative correlation
distance positive correlation

[156] These correlation or similarity measures are standard statistical correlation 

measures that are described in the GeneSpring Advanced Analysis Techniques 
Manual (Release Date March 13, 2001, Silicon Genetics). Where both positive and
negative correlations were obtained, combined positive and negative correlating gene
lists were also created.

[157] IV. Class Prediction: The Predict Parameter Values tool in GeneSpring™

software was used for kidney toxicity class prediction. The following is a summary of 
the procedure used in the GeneSpring predictive software. This is described in 
GeneSpring Advanced Analysis Techniques Manual (Release Date March 13, 2001,
Silicon Genetics) with additional information supplied by Silicon Genetics and a 
statistical expert. The prediction tool relies on standard statistical procedures that 
can be implemented in a variety of statistical software packages. 

[158] (IV)(A) Gene Selection: The first step is variable selection of genes to be 

used for prediction. This entails taking a single gene and a single class (e.g., kidney 
toxicity) and creating a contingency table. In the table below, columns 1 through N of 
the table each represent one possible cutoff point based on the gene expression 
level (ratio of signal/control) for that class. The number of possible cutoffs is less 
than or equal to the total number of samples for the class (e.g., A). It is possibly less
than the total number, since there may be ties in gene expression level. Hence, N,
M, and Q may or may not be distinct. In the example, an n-class problem is
illustrated, where x and y entries are the class counts at that gene expression cutoff
level, for that specific gene and class, either above ("a") or below ("b") the cutoff.
"Class1" is the set of all samples (above or below) the cutoff for Class1, and "!Class1"
are all those not in Class1 (above or below) the cutoff, and similarly for the other
classes. The class totals in the training set are the total class marginals used to
compute Fisher's exact test. 

[159] For a specific gene, and for each class, the best p-value as calculated by 

Fisher's Exact Test for independence between one of the pair of columns (e.g., 1a
and 1b) and the actual class totals (e.g., A) is used to score the gene (-ln(p) = the
score) for that class. Thus, there are N (or, M, Q etc.) contingency tables, where the
best score of the N tables is used for that class and gene. If there is a wide disparity 
between the above and below counts in either the a or b column (this is a two-sided 
Fisher's Exact Test), the smaller the p-value and the higher the score. 
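
The scoring step can be illustrated with a small sketch. This is one interpretation of the description above, not the GeneSpring code: for each distinct expression value used as a cutoff, a 2x2 table of class membership versus above/below the cutoff is built, a two-sided Fisher's exact p-value is computed, and the best -ln(p) over all cutoffs is kept. The function names and the convention that "above" includes the cutoff value are assumptions.

```python
import math


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(n, col1)

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= observed * (1 + 1e-9))
    return min(p, 1.0)


def gene_score(expr, in_class):
    """Best -ln(p) over all cutoffs for one gene and one class.

    expr: expression ratios, one per sample; in_class: matching booleans.
    """
    best = 0.0
    for cutoff in sorted(set(expr)):  # ties give fewer cutoffs than samples
        a = sum(1 for e, k in zip(expr, in_class) if k and e >= cutoff)
        b = sum(1 for e, k in zip(expr, in_class) if k and e < cutoff)
        c = sum(1 for e, k in zip(expr, in_class) if not k and e >= cutoff)
        d = sum(1 for e, k in zip(expr, in_class) if not k and e < cutoff)
        best = max(best, -math.log(fisher_exact_two_sided(a, b, c, d)))
    return best
```

A gene that separates the class perfectly at some cutoff receives the highest score attainable for that sample size, while a gene with no separation at any cutoff scores 0.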

[160] The genes per class are rank ordered by the most discriminating (highest) 
score. The predictivity list is composed of the most discriminating genes per class.
Namely, genes are combined that best discriminate class 1 with those that best
discriminate class 2 and so on. The genes are selected in rotation of the highest
score per class. Duplicate genes are ignored in the rotation and not added to the list;
the gene with the next highest score is taken instead.
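
The rotation described in this paragraph can be sketched as a round-robin pick over per-class ranked lists. The function name, the input layout and the handling of exhausted lists are assumptions made for illustration.

```python
def select_in_rotation(ranked_by_class, n_genes):
    """Pick genes round-robin from per-class ranked lists, skipping duplicates.

    ranked_by_class: dict mapping class name -> genes sorted best score first.
    """
    positions = {cls: 0 for cls in ranked_by_class}
    selected = []
    while len(selected) < n_genes:
        added = False
        for cls, ranked in ranked_by_class.items():
            # Skip genes already taken for another class.
            while positions[cls] < len(ranked) and ranked[positions[cls]] in selected:
                positions[cls] += 1
            if positions[cls] < len(ranked) and len(selected) < n_genes:
                selected.append(ranked[positions[cls]])
                positions[cls] += 1
                added = True
        if not added:  # every ranked list is exhausted
            break
    return selected
```

In this sketch, a gene that ranks first for both classes is taken once, and the second class falls through to its next best gene.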

[161] The training samples now have only the gene list garnered from the above 

procedure. As an example, where once the training samples may have had an initial 
list of 200 genes per sample, they now have only a subset composed of the gene list, 
for example, 50 (the number of predictivity genes specified) that are selected from 
the initial list by the gene selections procedure. Thus, each sample is a vector of 50 
normalized expression ratios. Since the selection of genes is done in rotation, the list 
contains 25 genes for one class, and 25 for the other class. The matrix below
illustrates the basic features of this gene selection process.



Gene 1     1a           1b           ...   Na           Nb
Class      Expression   Expression   ...   Expression   Expression   Actual Class Totals
           above        below              above        below        (Marginals)
Class1     x1.1a        x1.1b        ...   x1.Na        x1.Nb        A
!Class1    y1.1a        y1.1b        ...   y1.Na        y1.Nb        B

Gene 1     1            2            ...   M
Class2     x1.2a        x1.2b        ...   x1.Ma                     C
!Class2    y1.2a        y1.2b        ...   y1.Ma                     D

...

Gene 1     1            2            ...   Qa           Qb
Classn     x1.na        x1.nb        ...   x1.Qa        x1.Qb        X
!Classn    y1.na        y1.nb        ...   y1.Qa        y1.Qb        Y


[162] After the genes to be used in the training set have been selected, the test set 

is classified based on the k-nearest neighbor (knn) voting procedure. Using just
those genes in the gene list, for each sample in the test set of samples, the k nearest
neighbors in the training set are found with the Euclidean distance. The class to
which each of the k nearest neighbors belongs is determined, and the test set sample is
assigned to the class with the largest representation in the k nearest neighbors after
adjusting for the proportion of classes in the training set.

[163] For example, in a two-class problem, let there be 30 samples of class 1 and
60 samples of class 2 in the training set. With k = 9, say it is determined that 7
of the nearest neighbors to a sample from the testing set are in class 1. The sample
can then be classified as being a member of class 1. If another sample from the test
set has a total of 4 nearest neighbors in class 1, then after adjusting for the proportion, this
sample would be assigned to class 1 rather than class 2, even though the majority
vote suggests assignation to class 2.
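
A minimal sketch of proportion-adjusted k-nearest-neighbor voting is given below. The exact adjustment used by the software is not spelled out here, so the sketch assumes the simplest form: each class's vote count is divided by that class's size in the training set. The function name and data layout are hypothetical.

```python
import math
from collections import Counter


def knn_classify(train, test_vector, k):
    """Classify one sample by k-nearest-neighbor voting, adjusted for class size.

    train: list of (vector, class_label) pairs restricted to the predictive genes.
    """
    class_sizes = Counter(label for _, label in train)
    by_distance = sorted(train, key=lambda item: math.dist(item[0], test_vector))
    votes = Counter(label for _, label in by_distance[:k])
    # Divide each vote count by the size of that class in the training set.
    adjusted = {c: votes[c] / class_sizes[c] for c in votes}
    return max(adjusted, key=adjusted.get)
```

With 30 class 1 and 60 class 2 training samples and k = 9, a sample with 4 class 1 neighbors and 5 class 2 neighbors has adjusted votes of 4/30 versus 5/60 and is therefore assigned to class 1, matching the example above.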

[164] VI. Decision Threshold: The decision threshold is a mechanism to help 

clearly define the class into which the sample will fall, and can be set to reject 
classification if the voting is very close or tied. (Thus, k can be even for two-class 
problems without worrying about the tie problem.) A p-value is calculated for the 
proportion of neighbors in each class against the proportions found in the training set, 
again using Fisher's exact test, but now a one-sided test. 

[165] For example, let k = 11. If the proportion of neighbors of class 1 in the test
set is 6/11, and the proportion of class 1 in a 100 sample training set is 0.4, the
p-value calculated is 0.29 (half the two-sided test). If the proportion in the training set is
0.1, the p-value is 0.004. The smaller the p-value the greater the likelihood that the
sample from the testing set belongs to that class. 

[166] A p-value ratio (P-value) is set as a way of setting the level of confidence in 

individual sample predictions based on the ratio of p-values for the best class (lowest 
p-value) versus the second best class (second lowest p-value). For example, if the 
P-value is set at 0.5 and the ratio of p-values for a particular sample is 0.6, then the 
predictive model will not make a call for that sample. 
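
The p-value ratio rule can be sketched as follows, assuming the per-class p-values have already been computed. The function name and the use of `None` to represent a non-call are illustrative choices, not part of the described software.

```python
def make_call(p_values, ratio_cutoff=0.5):
    """Return the predicted class, or None (no call) if the vote is too close.

    p_values: dict mapping class label -> p-value; needs at least two classes
    with non-zero p-values.
    """
    ranked = sorted(p_values, key=p_values.get)  # lowest p-value first
    best, second = ranked[0], ranked[1]
    ratio = p_values[best] / p_values[second]
    # A ratio above the cutoff means the two best classes are too similar.
    return best if ratio <= ratio_cutoff else None
```

For instance, hypothetical p-values of 0.06 and 0.10 give a ratio of 0.6, which exceeds a cutoff of 0.5, so no call is made.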

[167] VII. Training and Test Data Sets: Data were separated into 6

training and test sets. The first training and test set was created by allocating one set 
of data as a training set (Set A training set) and another set of data as a test set (Set 
A test set). Other training and test sets were created by randomly distributing the 
compounds into the sets. This was accomplished by assigning random numbers to
lists of compounds that are negative and positive for histopathology, sorting by 
random number, and then dividing the sorted lists into a specific number of training 
and test sets. The training and test set assignments are presented in Table 4. 
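
The random allocation described above can be sketched as follows. The sketch only shows the random-number-and-sort step; how the resulting groups were paired into training and test sets is given in Table 4 and is not reproduced here. The function name, the fixed seed and the round-robin dealing of the sorted lists are assumptions.

```python
import random


def split_compounds(positives, negatives, n_sets, seed=0):
    """Distribute compounds into n_sets groups, balancing positives and negatives."""
    rng = random.Random(seed)
    groups = [[] for _ in range(n_sets)]
    for compounds in (positives, negatives):
        # Attach a random number to each compound, then sort by that number.
        keyed = sorted((rng.random(), name) for name in compounds)
        # Deal the sorted list out across the groups.
        for i, (_, name) in enumerate(keyed):
            groups[i % n_sets].append(name)
    return groups
```

Handling the histopathology-positive and histopathology-negative lists separately keeps the proportion of positive compounds roughly equal across groups.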

[168] VIII. Kidney Toxicology Classification: Kidney toxicity classifications were 

entered for each training and test set as a parameter column. Toxicity, as defined by
observation of tubular necrosis in the kidney at 72 hours after treatment, was
entered as a "yes" or "no" for each animal in a compound-dose group. Additionally, a 
parameter column for random histopathology classification was designated. This 
was done by randomly assigning the same number of "yes" and "no" calls to the 
individual animals. 

[169] IX. Prediction Output and Initial Data Processing: The "Predict Parameter 

Value" tool of GeneSpring was used with each of the training and test sets to 
generate predictions of histopathology classifications of the test sets. Unless 
otherwise specified a nearest neighbor setting of 10 (default) and P-value ratio cutoff 
of 0.5 were used. The number of genes used to predict was varied with standard
numbers of 50, 40, 30, 20, 10, 5, 2 and 1 genes used. For each number of genes the
numbers of correct calls, incorrect calls and non-calls were recorded. Non-calls are
cases where no prediction was made because the P-value ratio exceeded the
specified P-value ratio cutoff. Calculations were made for overall percent correct
calls (number of correct classifications/number of samples), percent correct calls of
called samples (number of correct classifications/number of samples with calls) and 
percent of called samples (samples with calls/number of samples). 

[170] For each input list and optimal number of predictive genes (lowest number of 

genes giving a maximum overall percent of correct calls) additional information was 
recorded that included the list of specific genes in the optimum predictive set. 

[171] X. Results: Expression array data were first examined for the existence of 

genes whose expression correlated with histopathology scores. Table 1 presents a 
list of the compounds and dose levels along with the kidney histopathology 
classification and histopathology severity scores used for this analysis. For each
distance measure the probability was adjusted in increments of 0.05 until at least 50 
correlating genes were obtained. Lists of correlating genes were obtained using the 
distance measures described in Materials and Methods. Example sets of correlating 
genes are provided in Tables 2 and 3. 

[172] The correlating gene lists as well as the entire array gene list were provided 

as input lists to the GeneSpring Predict Parameter Values tool (described in Materials
and Methods) that employs a K-means nearest neighbor (knn) predictive model. 
These lists as well as the entire array gene list were used for each of the six training 
and test sets defined in Materials and Methods to generate predictions of 
histopathology classifications of the test sets. Input genes for the Predict Parameter 
Value feature included all 700 genes in the GenePix file (the rat CT Array) which was 
disclosed in a currently pending application (serial number [Attorney docket no. 
40074-2000600]) filed on January 29, 2002, as well as smaller lists of genes whose 
expressions correlated with histopathology by the correlation measures described 
previously. The number of genes used to predict was varied with standard numbers
of 50, 40, 30, 20, 10, 5, 2 and 1 genes used. The specified number of predictive 
genes was varied to obtain an optimum number of predictive genes. Figure 2 
presents a typical profile for obtaining an optimum gene list. 

[173] After this was done for all 6 training and test sets, all gene lists were then

merged to create one aggregate list of predictive genes. Each gene on this 
aggregate list has predictive value for at least one of the training and test sets 
because it was observed to contribute to an optimum predictivity for a specific 
training/test set. The aggregate list was subdivided into smaller lists of genes based 
on the number of times a gene was predictive for an individual training or test set. 
For example, if 6 training and test sets were used, genes that were predictive in all 6 
training and test sets were designated as Combo (combination) 6. Genes that were 
predictive in only 5 of 6 training and test sets were designated as Combo 5, etc. A 
list of predictive genes organized by their occurrence in the separate training and test 
sets is presented in Table 5. 
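
The grouping into "Combo" lists amounts to counting, for each gene, the number of training/test sets in which it appeared in the optimum predictive list. A minimal sketch under that reading, with a hypothetical function name and input layout:

```python
from collections import Counter


def combo_lists(optimal_gene_lists):
    """Group genes by the number of training/test sets in which they were predictive.

    optimal_gene_lists: one list of predictive genes per training/test set.
    Returns {count: sorted gene names}; key 6 corresponds to "Combo 6".
    """
    counts = Counter(g for genes in optimal_gene_lists for g in set(genes))
    combos = {}
    for gene, n in counts.items():
        combos.setdefault(n, []).append(gene)
    return {n: sorted(genes) for n, genes in combos.items()}
```

The union of all returned lists is the aggregate list of predictive genes.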

[174] Example 2 Predictive Properties and Evaluation of Predictive Genes from 24

Hour Expression Data 

(A) Materials and Methods: The database used was as described in Example 1. 

[175] (B) Array data, normalization procedures and transformations used in these 

analyses are as described in Example 1. Table 39 presents 24 hour gene
expression data for the predictive genes. These data can be used with a k-means 
nearest neighbor prediction model (as available in GeneSpring or other statistical 
software packages) to make predictions as described in this example. 

[176] (C) The Predict Parameter Values tool in GeneSpring™ software was used 

for kidney toxicity class prediction. A description of this tool and the statistical 
procedures used is provided in Example 1.

[177] (D) The training and test data sets used are those described in Table 4. 

[178] (E) Kidney toxicology classifications used are described in Table 1. In this

analysis randomized classifications (same number of "yes" and "no" classifications 
distributed randomly among the samples) were used. 

[179] (F) Prediction Output and Initial Data Processing: For each gene list 

prediction used for evaluation a table of data generated by the Predict Parameter 
Values tool in GeneSpring™ software was saved which provided for each sample in 
the test set the actual call ("yes" or "no" for kidney toxicity), the predicted call ("yes", 
"no" or no call for kidney toxicity) and the P-value cutoff ratio. This set of data was 
used to calculate predictive performance measures provided below. 

[180] (G) Measures of prediction used for these analyses are generally accepted 

prediction measures for information about actual and predicted classifications done 
by a classification system (Venables and Ripley, Modern Applied Statistics with
S-Plus, 3rd edition, Springer, 1994 and Kubat and Matwin, Proc. 14th International
Conference on Machine Learning, 1997). Results from predictions of a two class 
case can be described as a two-class matrix: 

                    Predicted negative   Predicted positive
Actual negative     a                    b
Actual positive     c                    d

[181] Standard terms used for prediction are: Accuracy, which is the proportion of
the total number of predictions that are correct, is calculated as: (a+d)/(a+b+c+d).

[182] False positive rate, the proportion of negative cases that are incorrectly
classified as positive, is calculated as: b/(a+b).

[183] False negative rate, the proportion of positive cases that are incorrectly
classified as negative, is calculated as: c/(c+d).

[184] Geometric mean, a performance measure that takes into account the
proportion of positive and negative cases (Kubat et al., ibid), is calculated as: the
square root of TP*TN, where TP = true positive rate (d/(c+d)) and TN = true negative
rate (a/(a+b)). In those cases where no prediction was made because the p-value
ratio exceeded the cutoff-value (generally 0.5), the non-call was considered to be
incorrect.
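
These four measures can be computed directly from the cell counts. In the sketch below, a is taken as the number of true negatives, b false positives, c false negatives and d true positives, as implied by the formulas; the function name and the example counts are hypothetical. Treating a non-call as incorrect is assumed to mean counting it in b for an actual negative and in c for an actual positive.

```python
import math


def prediction_measures(a, b, c, d):
    """a: true negatives, b: false positives, c: false negatives, d: true positives."""
    tp_rate = d / (c + d)
    tn_rate = a / (a + b)
    return {
        "accuracy": (a + d) / (a + b + c + d),
        "false_positive_rate": b / (a + b),
        "false_negative_rate": c / (c + d),
        "geometric_mean": math.sqrt(tp_rate * tn_rate),
    }
```

Because the geometric mean multiplies the two per-class rates, it stays low when either class is predicted poorly, even if the overall accuracy is high in an unbalanced sample set.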

[185] (H) Subsets of randomly selected genes were prepared from the predictive

gene sets to test whether such subsets would have predictive value. Assignments of 
genes to these subsets are presented in Tables 6-10. 

[186] (I) Prediction results for 24 hour expression data using genes identified as 

predictive are presented in Table 11. These data indicate a very high accuracy in
predicting kidney toxicity. Mean accuracy exceeded 0.9 (90% accuracy) for the 
entire predictive gene list (Combo All) and the Combo 6 gene subset and 0.8 (80% 
accuracy) for the Combo 5 and 4 subsets. As expected, the predictive performance 
of the gene sets increased from the lowest occurrence gene list (Combo 1) to the 
highest occurrence gene list (Combo 6). 

[187] Because these predictions were conducted with multiple training/test set

combinations, it is possible to obtain an indication of the variability in prediction rates
and robustness of the prediction capabilities of these gene sets. For the Combo All 
and Combo 6, 5 and 4 gene sets there was very good predictivity for all training/test 
sets of data with over 0.8 accuracy as a minimum value for any one training and test 
set. False positive prediction rates were generally low with means less than 0.1 for 
Combo All and Combo 6, 5 and 4. Because the proportion of negative classifications 
was much higher than the proportion of positive (toxic) classifications in these sample 
sets the false negative rates would be expected to be higher than the false positive 
rates and this was observed to be the case. Although the false negative rates were 
higher than the false positive rates, there was still very good prediction of positive 
responses with mean false negative rates of about 0.3 for Combo All, Combo 6, 
Combo 5 and Combo 4 gene sets. The geometric mean was used as an indication of 
predictive performance that includes consideration of the proportion of positive and 
negative classifications. All gene sets gave geometric mean measures >0.5 and 
three gene sets (Combo All, Combo 6 and Combo 5) had mean measures >0.8.

[188] In these analyses, in cases where no prediction was made because the

p-value ratio exceeded the cutoff-value (generally 0.5), the non-call was considered to
be incorrect. 

[189] Prediction results for 24 hour expression data using genes identified as

predictive, where the predicting unit is compound-dose, are presented in Table 12. This
prediction unit is probably the most relevant for toxicology prediction. The 
performance of the genes in predicting compound-dose toxicity is even better than 
predictions on an individual animal basis. These data indicate a very high accuracy 
in predicting kidney toxicity. Mean accuracy exceeded 0.9 (90% accuracy) for the 
entire predictive gene list (Combo All) and Combo 6, 5, 4 and 3 gene lists. As 
expected, the predictive performance of the gene sets increased from the lowest 
occurrence gene list (Combo 1) to the highest occurrence gene list (Combo 6). 
Accuracy was better than 0.8 (80%) for the Combo 2 and Combo 1 lists. Variability 
in accuracy was low for most of the gene lists with >0.8 minimum accuracy for any 
single training and test set observed for the Combo All and Combo 6, 5, 4 and 3 
gene lists. Particularly noteworthy on the compound-dose level prediction is the low
false-negative rate observed for Combo All, Combo 6 and Combo 5 gene lists. The
mean false negative rate was about 0.2 or less for these gene lists. As observed on
an individual animal basis the false-positive rate was very low for all gene sets with
mean rates of <0.12 for all gene sets.

[190] One noteworthy feature of the predictive ability is the ability to distinguish

between effects of a compound at different dose levels. Two compounds,
gancyclovir and cyclophosphamide, produced kidney toxicity at the high dose but not
at the low dose. The predictive gene sets, particularly the Combo All, Combo 6 and
Combo 5 sets, accurately predicted toxicity at the high dose level, but not at the low
dose level.

[191] Prediction results for 24 hour expression data using genes identified as

predictive and the predicting unit is compound are presented in Table 13. In terms of
predicting toxicity of compounds the predictive capability was excellent with no
compounds missed using the Combo 6 and Combo 5 gene sets and very low false
positive rates for all of the gene sets.

[192] Cumulative performance for the Combo gene lists was examined by adding

genes one at a time in an order based on predictive weight as calculated by
GeneSpring software. This order (and predictive weight) were different for each
training set so a mean weight was used to obtain a single gene order for the
predictive sets tested. The gene order is presented in Table 14.

[193] Cumulative predictive performance for the Combo 6, Combo 5 and Combo 4

predictive gene sets are presented in Figures 4-6.

[194] The cumulative performance data clearly indicate that very good predictive

performance can be achieved with small subsets of the Combo gene sets. For
Combo 6, the accuracy reached a plateau level of about 90% at 3 genes. For
Combo 5, a similar plateau level was reached with about 8 genes and for Combo 4
the plateau level was reached with about 13 genes. This illustrates the increased
predictive power of small sets of genes rather than single genes. The increased
number of genes required to reach a predictive performance plateau of the different
Combo sets is consistent with the hierarchy of performance prediction in the Combo
sets.
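
The ordering step used for the cumulative analysis (a mean predictive weight across training sets, followed by adding genes one at a time) may be sketched as follows. This is an illustrative Python sketch only: the gene names and weights are hypothetical, and it is assumed here that a larger weight indicates a more predictive gene. The weights themselves were calculated by GeneSpring software and are not reproduced by this code.

```python
from statistics import mean

def order_genes_by_mean_weight(weights_by_training_set):
    """Order genes by mean predictive weight across training sets.

    weights_by_training_set: list of dicts mapping gene -> weight,
    one dict per training set, all with the same genes.
    """
    genes = list(weights_by_training_set[0])
    mean_weight = {
        g: mean(w[g] for w in weights_by_training_set) for g in genes
    }
    # Highest mean weight first (assumed to be the most predictive).
    return sorted(genes, key=lambda g: mean_weight[g], reverse=True)

def cumulative_gene_sets(ordered_genes):
    """Return the nested gene sets: top 1 gene, top 2 genes, and so on."""
    return [ordered_genes[:n] for n in range(1, len(ordered_genes) + 1)]

# Hypothetical weights for three genes in two training sets.
weights = [
    {"gene_A": 3.0, "gene_B": 1.0, "gene_C": 2.0},
    {"gene_A": 1.0, "gene_B": 2.0, "gene_C": 4.0},
]
ordered = order_genes_by_mean_weight(weights)
subsets = cumulative_gene_sets(ordered)
```

Each nested subset would then be supplied to the prediction model in turn, and the accuracy plotted against the number of genes.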

[195] Tables 15 and 16 show the level of predictive accuracy of individual genes of

Combo 6 and Combo 5 (the top Combo subsets with the highest levels, 92.1% and
89.6%, respectively, of predictive accuracy on an individual sample basis) for 24 hour
kidney data.

[196] These tables show that overall, individual genes of both combo groups did 

not perform as well as the whole combination, as the average predictive accuracy of 
individual genes of Combo 6 was 67.7% and for Combo 5 was 62.7%. The tables also
show that while some of the individual genes of both Combos gave a moderate to
good level of predictive accuracy (as high as 79.7% for Combo 6 and as high as 
75.6% for Combo 5), the predictive accuracy of individual genes never exceeded the 
predictive accuracy of the whole combination. The data further support the 
cumulative gene predictivity conclusion that small subsets of genes have superior 
predictive power compared to individual genes. 

[197] In order to assess the performance of subsets of genes, predictive 

performance was evaluated for subsets of genes randomly selected from the total 
combined predictive list (Combo All) and the top Combo sets (as defined in Materials 
and Methods). Prediction results for 24 hour expression data using randomly 
selected subsets of genes are presented in Table 17. 

[198] These data clearly indicate that subsets of the Combo gene lists have 

predictive power. The predictive performance, as indicated by several measures 
including accuracy and geometric mean, increased in parallel with the predictive 
power of the gene set from which the genes were selected. The predictive power 
also generally increased as the number of randomly collected genes increased. In 
the case of the Combo 4, 5 and 6 sets, the 15 gene random subset had predictive 
performance that was close to that of the entire gene set.

[199] Table 18 compares prediction accuracy for correct classification of kidney

toxicity and for the same proportion of positive and negative toxicity calls randomly
assigned to the samples (random classification). For each gene set or subset
predictions were made using the same six training/test sets as for the other prediction
analyses. Additionally, sets of genes were randomly chosen from the array which
were not identified on the list of 216 predictive genes at 24 hour (Example 1, Table
10).

[200] It is clear from these data that the predictions with accurate classification are 

much better than predictions with randomized classification. This means that the 
predictive results are not simply due to chance and large data sets but are due to 
significant, meaningful predictive association between the gene expression of the 
predictive genes and the kidney toxicity. The accuracy numbers for the gene sets 
selected from a list of all genes on the array minus the predictive genes are much 
lower than the Combo predictive lists and the random subsets of these predictive 
lists. This also verifies the predictive power of the identified predictive genes. The 
fact that the predictive numbers from these subsets are somewhat higher for 
accurate than random classification is likely due to some residual predictivity in these 
genes that is not very substantial. 

[201] Example 3: Discovery of Kidney Toxicity Predictive Genes from 6 Hour 

Expression Data: (A) Materials and Methods: Compounds and treatments list used to 
construct the kidney database are given in Example 1. This table also provides the
evaluation of the kidney toxicity observed as kidney tubular necrosis in samples
collected 72 hours after treatment. The database is described in detail in Example 1.
This Example analyzes expression data from samples collected 6 hours after
treatment. Array data, normalization and transformation procedures used were as
described in Example 1. Procedures and methods for obtaining gene lists correlating
with histopathology scores were as described in Example 1 with scores as in
Example 1. The Predict Parameter Values tool in GeneSpring™ software used for
kidney toxicity class prediction is described in detail in Materials and Methods of
Example 1.

[202] (B) Training and Test Data Sets: Data were each separated into 6 training

and test sets. The first training and test set was created by allocating one set of data
as a training set (Set A training set) and another set of data as a test set (Set A test
set). Other training and test sets were created by randomly distributing the
compounds into the sets. This was accomplished by assigning random numbers to
lists of compounds that are negative and positive for histopathology, sorting by
random number, and then dividing the sorted lists into a specific number of training
and test sets. The training and test set assignments are presented in Table 19.
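
The random distribution of compounds described above may be sketched as follows. This Python sketch is illustrative only: the compound names, the `test_fraction` parameter and the seed are hypothetical, since the specification does not state how many compounds were placed in each set. The negative and positive lists are handled separately so that both classes appear in the training set and in the test set.

```python
import random

def split_compounds(negatives, positives, test_fraction=0.3, seed=0):
    """Assign compounds to one training set and one test set.

    Each class list is given random numbers, sorted by those numbers,
    and then divided; test_fraction is an assumption of this sketch.
    """
    rng = random.Random(seed)
    training, test = [], []
    for compounds in (negatives, positives):
        # Pair each compound with a random number and sort by it.
        keyed = sorted((rng.random(), name) for name in compounds)
        ordered = [name for _, name in keyed]
        n_test = round(len(ordered) * test_fraction)
        test.extend(ordered[:n_test])
        training.extend(ordered[n_test:])
    return training, test

# Hypothetical compound identifiers.
negatives = [f"neg_{i}" for i in range(10)]
positives = [f"pos_{i}" for i in range(4)]
training, test = split_compounds(negatives, positives)
```

Repeating the call with different seeds would give additional training and test set combinations of the kind listed in Table 19.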

[203] (C) Kidney toxicity classifications were entered for training and test set as a 

parameter column. Toxicity, as defined by observation of kidney tubular necrosis in 
the kidney at 72 hours after treatment, was entered as a "yes" or "no" for each animal 
in a compound-dose group. Additionally, a parameter column for random 
histopathology classification was designated. This was done by randomly assigning 
"yes" and "no" calls to the individual animals. The total number of "yes" and "no" calls 
was maintained the same as in the correct classification, so that the proportion of
"yes" and "no" calls was the same in all the training and test sets.

[204] (D) Prediction Output and Initial Data Processing: The "Predict Parameter 

Value" tool of GeneSpring was used with each of the training and test sets to 
generate predictions of histopathology classifications of the test sets. Unless 
otherwise specified, a nearest neighbor setting of 10 (default) and a P-value ratio cutoff
of 0.5 were used. The number of genes used to predict was varied with standard
numbers of 50, 40, 30, 20, 10, 5, 2 and 1 genes used. For each number of genes the 
numbers of correct calls, incorrect calls and non-calls were recorded. Non-calls are 
cases where no prediction was made because the P-value ratio exceeded the 
specified P-value ratio cutoff. Calculations were made for overall percent correct 
calls (number of correct classifications/number of samples), percent correct calls of
called samples (number of correct classifications/number of samples with calls) and 
percent of called samples (samples with calls/number of samples). 
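
The three percentages defined above may be computed as in the following minimal Python sketch. The function name and the example counts are hypothetical and are not taken from the Tables.

```python
def call_summary(n_correct, n_incorrect, n_non_calls):
    """Return the three call percentages for one prediction run."""
    n_samples = n_correct + n_incorrect + n_non_calls
    n_called = n_correct + n_incorrect   # samples that received a call
    return {
        "overall_percent_correct": 100.0 * n_correct / n_samples,
        "percent_correct_of_called": 100.0 * n_correct / n_called,
        "percent_called": 100.0 * n_called / n_samples,
    }

# Hypothetical run: 16 correct calls, 2 incorrect calls, 2 non-calls.
summary = call_summary(16, 2, 2)
```

The overall percent correct is the strictest of the three because non-calls remain in its denominator.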

[205] For each input list and optimal number of predictive genes (lowest number of
genes giving a maximum overall percent of correct calls) additional information was
recorded that included the list of specific genes in the optimum predictive set. 

[206] (E) Results: Expression array data were first examined for the existence of 

genes whose expression correlated with histopathology scores. Materials and 
Methods of Example 1 presents a list of the compounds and dose levels along with 
the kidney histopathology classification and histopathology severity scores used for 
this analysis. For each distance measure the probability was adjusted in increments 
of 0.05 until at least 50 correlating genes were obtained. Lists of correlating genes 
were obtained using the distance measures described in Materials and Methods. 
Example sets of correlating genes are provided in Tables 20 and 21. The correlating
gene lists as well as the entire array gene list were provided as input lists to the 
GeneSpring Predict Parameter value tool (described in Materials and Methods) that 
employs a K-means nearest neighbor (knn) predictive model. These lists as well as 
the entire array gene list were used for each of the six training and test sets defined 
in Materials and Methods to generate predictions of histopathology classifications of 
the test sets. Input genes for the Predict Parameter Value feature included all 700 
genes in the GenePix file (the rat CT Array) as well as smaller lists of genes whose 
expressions correlated with histopathology by the correlation measures described 
previously. The number of genes used to predict was varied with standard numbers
of 50, 40, 30, 20, 10, 5, 2 and 1 genes used. The specified number of predictive 
genes was varied to obtain an optimum number of predictive genes. 

[207] After this was done for all 6 training and test sets, all gene lists were then 

merged to create one aggregate list of predictive genes. Each gene on this 
aggregate list has predictive value for at least one of the training and test sets 
because it was observed to contribute to an optimum predictivity for a specific 
training/test set. The aggregate list was subdivided into smaller lists of genes based 
on the number of times a gene was predictive for an individual training or test set. 
For example, if 6 training and test sets were used, genes that were predictive in all 6 
training and test sets were designated as Combo (combination) 6. Genes that were 
predictive in only 5 of 6 training and test sets were designated as Combo 5, etc.

[208] A list of predictive genes organized by their occurrence in the separate

training and test sets is presented in Table 22.
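
The subdivision of the aggregate list into Combo lists by occurrence may be sketched as follows. This Python sketch is illustrative only; the gene identifiers are hypothetical, and each input set stands for the optimum predictive gene set found for one training/test set.

```python
from collections import Counter, defaultdict

def combo_lists(optimal_sets):
    """Group genes by the number of training/test sets in which they
    appeared in the optimum predictive set.

    Returns a dict mapping occurrence count -> sorted list of genes,
    so that key 6 corresponds to Combo 6, key 5 to Combo 5, and so on.
    """
    counts = Counter(g for genes in optimal_sets for g in set(genes))
    combos = defaultdict(list)
    for gene, n in sorted(counts.items()):
        combos[n].append(gene)
    return dict(combos)

# Hypothetical optimum gene sets for six training/test sets.
optimal_sets = [
    {"g1", "g2", "g3"},
    {"g1", "g2"},
    {"g1", "g3"},
    {"g1", "g2"},
    {"g1", "g2"},
    {"g1", "g2", "g4"},
]
combos = combo_lists(optimal_sets)
combo_all = sorted(set().union(*optimal_sets))  # the aggregate list
```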

[209] Example 4: Predictive Properties and Evaluation of Predictive Genes from 6

Hour Expression Data: (A) Materials and Methods: The database used was as
described in Example 1. Array data, normalization procedures and transformations
used in these analyses are as described in Example 1. Table 38 presents 6 hour
gene expression data for the predictive genes. These data can be used with a
k-means nearest neighbor prediction model (as available in GeneSpring or other
statistical software packages) to make predictions as described in this example. The
Predict Parameter Values tool in GeneSpring™ software was used for kidney toxicity
class prediction. A description of this tool and the statistical procedures used is
provided in Example 1.

[210] (B) Training and Test Data Sets: The training and test data sets used are

those described in Table 19. 

[211] (C) Kidney Toxicology Classification: Kidney toxicology classifications used

are described in Example 1. In this analysis randomized classifications (same 
number of "yes" and "no" classifications distributed randomly among the samples) 
were used. 
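
A randomized classification that keeps the same number of "yes" and "no" calls can be obtained by shuffling the correct calls among the samples, as in the following Python sketch. The function name, the seed and the example calls are hypothetical; the specification does not state which randomization procedure was actually used.

```python
import random
from collections import Counter

def randomize_calls(calls, seed=0):
    """Return the same "yes"/"no" calls redistributed at random.

    Shuffling a copy preserves the total number of each call, so the
    proportion of "yes" and "no" is unchanged.
    """
    shuffled = list(calls)
    random.Random(seed).shuffle(shuffled)
    return shuffled

# Hypothetical correct classification: 4 toxic and 16 non-toxic samples.
correct_calls = ["yes"] * 4 + ["no"] * 16
random_calls = randomize_calls(correct_calls, seed=1)
```

Predictions made against such shuffled calls give the chance-level baseline against which the correct classification is compared.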

[212] (D) Prediction Output and Initial Data Processing: For each gene list 

prediction used for evaluation a table of data generated by the Predict Parameter 
Values tool in GeneSpring™ software was saved which provided for each sample in 
the test set the actual call ("yes" or "no" for kidney toxicity), the predicted call ("yes", 
"no" or no call for kidney toxicity) and the P-value cutoff ratio. This set of data was 
used to calculate predictive performance measures provided below. 

[213] (E) Prediction Measures: Measures of prediction used for these analyses are

generally accepted prediction measures for information about actual and predicted
classifications done by a classification system (Venables and Ripley, ibid and Kubat
and Matwin, ibid). Results from predictions of a two class case can be described as
a two-class matrix as described above.

[214] (F) Results: Prediction results for 6 hour expression data using genes 

identified as predictive and the predicting unit is compound-dose are presented in 
Table 23. This prediction unit is probably the most relevant for toxicology prediction. 
The performance of the genes in predicting compound-dose toxicity is even better 
than predictions on an individual animal basis. 

[215] These data indicate some accuracy in predicting kidney toxicity. Mean 

accuracy exceeded 0.7 (70% accuracy) for the entire predictive gene list (Combo All) 
and Combo 6 and 5 gene lists. As expected, the predictive performance of the gene 
sets generally increased from the lowest occurrence gene list (Combo 1) to the 
highest occurrence gene list (Combo 6) with the exception of the Combo 5 list. Mean 
false negative values were in the range of 0.4-0.6 as were the geometric mean 
measures. 

[216] Example 5: Discovery of Kidney Toxicity Predictive Genes from 72 Hour

Expression Data: (A) Materials and Methods: Compounds and treatments list used to
construct the kidney database are given in Example 1. This table also provides the
evaluation of the kidney toxicity observed as kidney tubular necrosis in samples 
collected 72 hours after treatment. The Database is described in detail in Example 1. 
This Example analyzes expression data from samples collected 72 hours after
treatment. Array data, normalization and transformation procedures used were as 
described in Example 1. Procedures and methods for obtaining gene lists correlating 
with histopathology scores were as described in Example 1 with scores as in 
Example 1. The Predict Parameter Values tool in GeneSpring™ software used for 
kidney toxicity class prediction is described in detail in Materials and Methods of
Example 1. 

[217] (B) Training and Test Data Sets: Data were each separated into 6 training

and test sets. The first training and test set was created by allocating one set of data 
as a training set (Set A training set) and another set of data as a test set (Set A test 
set). Other training and test sets were created by randomly distributing the
compounds into the sets. This was accomplished by assigning random numbers to
lists of compounds that are negative and positive for histopathology, sorting by 
random number, and then dividing the sorted lists into a specific number of training 
and test sets. 

The training and test set assignments are presented in the following Table 24. 

[218] (C) Kidney Toxicology Classification: Kidney toxicity classifications were

entered for training and test set as a parameter column. Toxicity, as defined by 
observation of kidney tubular necrosis in the kidney at 72 hours after treatment, was 
entered as a "yes" or "no" for each animal in a compound-dose group. Additionally, a 
parameter column for random histopathology classification was designated. This 
was done by randomly assigning "yes" and "no" calls to the individual animals. The 
total number of "yes" and "no" calls was maintained the same as in the correct 
classification, so that the proportion of "yes" and "no" calls was the same in all the
training and test sets. 

[219] (D) Prediction Output and Initial Data Processing: The "Predict Parameter 

Value" tool of GeneSpring was used with each of the training and test sets to 
generate predictions of histopathology classifications of the test sets. Unless 
otherwise specified, a nearest neighbor setting of 10 (default) and a P-value ratio cutoff
of 0.5 were used. The number of genes used to predict was varied with standard
numbers of 50, 40, 30, 20, 10, 5, 2 and 1 genes used. For each number of genes the
numbers of correct calls, incorrect calls and non-calls were recorded. Non-calls are
cases where no prediction was made because the P-value ratio exceeded the
specified P-value ratio cutoff. Calculations were made for overall percent correct calls
(number of correct classifications/number of samples), percent correct calls of called
samples (number of correct classifications/number of samples with calls) and percent 
of called samples (samples with calls/number of samples). 

[220] For each input list and optimal number of predictive genes (lowest number of 

genes giving a maximum overall percent of correct calls) additional information was 
recorded that included the list of specific genes in the optimum predictive set.

[221] (E) Results: Expression array data were first examined for the existence of

genes whose expression correlated with histopathology scores. Materials and 
Methods of Example 1 presents a list of the compounds and dose levels along with 
the kidney histopathology classification and histopathology severity scores used for 
this analysis. For each distance measure the probability was adjusted in increments 
of 0.05 until at least 50 correlating genes were obtained. Lists of correlating genes 
were obtained using the distance measures described in Materials and Methods. 
Example sets of correlating genes are provided in Tables 25-26. The correlating 
gene lists as well as the entire array gene list were provided as input lists to the 
GeneSpring Predict Parameter value tool (described in Materials and Methods) that 
employs a K-means nearest neighbor (knn) predictive model. These lists as well as
the entire array gene list were used for each of the six training and test sets defined
in Materials and Methods to generate predictions of histopathology classifications of
the test sets. Input genes for the Predict Parameter Value feature included all 700 
genes in the GenePix file (the Rat CT Array) as well as smaller lists of genes whose 
expressions correlated with histopathology by the correlation measures described 
previously. The number of genes used to predict was varied with standard numbers
of 50, 40, 30, 20, 10, 5, 2 and 1 genes used. The specified number of predictive 
genes was varied to obtain an optimum number of predictive genes. 

[222] After this was done for all 6 training and test sets, all gene lists were then 

merged to create one aggregate list of predictive genes. Each gene on this 
aggregate list has predictive value for at least one of the training and test sets 
because it was observed to contribute to an optimum predictivity for a specific 
training/test set. The aggregate list was subdivided into smaller lists of genes based 
on the number of times a gene was predictive for an individual training or test set. 
For example, if 6 training and test sets were used, genes that were predictive in all 6 
training and test sets were designated as Combo (combination) 6. Genes that were 
predictive in only 5 of 6 training and test sets were designated as Combo 5, etc. 

[223] A list of predictive genes organized by their occurrence in the separate 

training and test sets is presented in Table 27.

[224] Example 6: Predictive Properties and Evaluation of Predictive Genes from 72

Hour Expression Data: (A) Materials and Methods: The Database used was as
described in Example 1. Array data, normalization procedures and transformations
used in these analyses are as described in Example 1. Table 40 presents 72 hour
gene expression data for the predictive genes. These data can be used with a
k-means nearest neighbor prediction model (as available in GeneSpring or other
statistical software packages) to make predictions as described in this example. The
Predict Parameter Values tool in GeneSpring™ software was used for kidney toxicity
class prediction. A description of this tool and the statistical procedures used is
provided in Example 1. The training and test data sets used are those described in
Example 1.

[225] (B) Kidney Toxicology Classification: Kidney toxicology classifications used 

are described in Example 1. In this analysis randomized classifications (same 
number of "yes" and "no" classifications distributed randomly among the samples) 
were used. 

[226] (C) Prediction Output and Initial Data Processing: For each gene list 

prediction used for evaluation a table of data generated by the Predict Parameter 
Values tool in GeneSpring™ software was saved which provided for each sample in 
the test set the actual call ("yes" or "no" for kidney toxicity), the predicted call ("yes", 
"no" or no call for kidney toxicity) and the P-value cutoff ratio. This set of data was 
used to calculate predictive performance measures provided below. 

[227] (D) Prediction Measures: Measures of prediction used for these analyses are 

generally accepted prediction measures for information about actual and predicted 
classifications done by a classification system (Venables and Ripley, ibid and Kubat 
and Matwin, ibid). Results from predictions of a two class case can be described
as a two-class matrix as described above.

[228] (E) Results: Prediction results for 72 hour expression data using genes 

identified as predictive and the predicting unit is compound-dose are presented in 
Table 28. This prediction unit is probably the most relevant for toxicology prediction.
The performance of the genes in predicting compound-dose toxicity is even better
than predictions on an individual animal basis.

[229] These data indicate a high accuracy in predicting kidney toxicity. Mean

accuracy exceeded 0.85 (85% accuracy) for the entire predictive gene list (Combo
All) and 0.8 (80% accuracy) for the Combo 6 and 4 subsets. False positive prediction
rates were generally low for Combo All (mean less than 0.1) as well as the other
Combos except Combo 2 (means 0.138 - 0.228). Because the proportion of
negative classifications was much higher than the proportion of positive (toxic)
classifications in these sample sets the false negative rates would be expected to be
higher than the false positive rates and this was observed to be the case. The
geometric mean was used as an indication of predictive performance that includes
consideration of the proportion of positive and negative classifications. Combo All,
Combo 6, Combo 5 and Combo 4 gave geometric mean measures >0.6.

[230] Example 7: Alternate Models for Predicting Kidney Toxicity: (A) Materials and

Methods: The database used for evaluation of these models was the 24 hour
expression data for kidney samples described above. Expression data was for the
Combo 6 set of predictive genes as described herein. Due to heteroscedasticity (i.e.,
the variance increases proportionately more than the mean increases) of the gene
expression ratio data, a log transformation of the data is often considered. In general
untransformed data was used but for some models log transformed data was used
for comparison. Six training and testing sets were used that are the same as
described in Example 1.

[231] (B) Predictive Modeling: The predictive task with the kidney toxicology gene

expression data is a two-class classification problem, where the two classes of
possible responses are defined by either kidney toxicity histopathology (yes) or
absence of kidney toxicity histopathology (no). This is an uneven class problem in
that the class of yes responses is roughly 20 percent of the data or less in the
database tested. A discrimination function is used to classify a training set. This
function is cross-validated with a testing set, often repeatedly to quantify the mean
and variation of the classification error. There are numerous common discrimination
functions, and a comparative study of the performance of these functions is useful in
determining the best classifier. Additional measures are then used to compare the
performance of the classifiers. Since the classes are of significantly uneven sizes,
a geometric mean measure (GMM) was used to compare models, namely, the
square root of the product of the true positive rate and the true negative rate.

[232] Common discrimination methods are Fisher's linear discriminant, quadratic 

discriminant (Mahalanobis distance), k-nearest neighbors (knn), logistic discriminant
(MacLachlan, 1992), classification trees (or more generally known as recursive 
partitioning) (Breiman et al., 1984; Clark and Pregibon, 1993; Quinlan and Kaufman, 
1988), and neural network classifiers (Ripley, 1996). Most are formula-based such 
as linear and quadratic discriminant, whereas others are rule-based, such as 
recursive partitioning, or algorithmically based, such as knn. knn is also database 
dependent in that a database containing the training set is needed to perform nearest
neighbor search and classification. 

[233] (C) Classifier Models: A variety of common classification techniques were 

evaluated. As an extension of the k-means nearest neighbor (knn) model a simple
hybrid classifier was designed and tested, using the knn results, to transform the knn 
model into a database independent model. This model is termed a centroid model. 
The centroid model uses the correctly identified test data results from knn and 
locates a centroid of the subset of k samples that are of the same class for each 
correctly identified test sample. The centroid is assigned the correct class, and with 
new test data, a sample is assigned the class of its nearest centroid. 
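
One possible reading of the centroid model described above is sketched below in Python. This is an illustration under stated assumptions and not the implementation used for Table 29: Euclidean distance and a simple majority vote among the k neighbors are assumed (the GeneSpring P-value ratio rule is not reproduced), and the two-gene expression values are hypothetical.

```python
import math
from collections import Counter

def nearest(train, x, k):
    """Return the k training (vector, label) pairs closest to x."""
    return sorted(train, key=lambda s: math.dist(s[0], x))[:k]

def build_centroids(train, test, k):
    """Build one labeled centroid per correctly identified test sample."""
    centroids = []
    for x, actual in test:
        neighbors = nearest(train, x, k)
        predicted = Counter(lbl for _, lbl in neighbors).most_common(1)[0][0]
        if predicted != actual:
            continue  # misclassified test samples contribute no centroid
        # Keep only the neighbors that share the sample's class.
        same = [vec for vec, lbl in neighbors if lbl == actual]
        centroid = tuple(sum(col) / len(same) for col in zip(*same))
        centroids.append((centroid, actual))
    return centroids

def classify(centroids, x):
    """Assign a new sample the class of its nearest centroid."""
    return min(centroids, key=lambda c: math.dist(c[0], x))[1]

# Hypothetical two-gene expression values.
train = [
    ((1.0, 1.0), "no"), ((1.2, 0.9), "no"), ((0.9, 1.1), "no"),
    ((3.0, 3.0), "yes"), ((3.2, 2.8), "yes"), ((2.9, 3.1), "yes"),
]
test = [((1.1, 1.0), "no"), ((3.0, 2.9), "yes")]
centroids = build_centroids(train, test, k=3)
```

Once the centroids are stored, `classify` needs only the centroid list and not the training database, which is the sense in which the model is database independent.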

[234] In addition to the knn and centroid models described above, tree,

logistic, and neural network models were employed. The neural network is a simple, 
feed-forward network, allowing skip layers, and with an entropy fitting criterion. 
Linear classifiers perform poorly with respect to this data and quadratic classifiers 
perform modestly, so their results are not presented.

[235] (D) Cross Validation of the Models: Six training and testing sets were used to

cross-validate knn. Gene selection ranking was then performed on each training set.
A number of different gene sets were used for each of the six sets and the best GMM 
value was chosen to represent the performance of the model. Trees were pruned via 
ten-fold internal cross-validation, (i.e., using subsets of the training set) for each 
training set, and then the tree was used to predict the testing set. A GMM was thus 
calculated for each testing set. Trees perform the gene selection via pruning, and 
anywhere from one to five genes were selected for each tree. The centroid model is 
five-fold cross-validated using random subsets of the testing set. The mean of the 
GMM of each of the validation runs is used as the performance measure. The top 
five discriminating genes are used in the centroid models. The logistic discrimination 
uses a stepwise backwards selection process to determine the gene set during the 
training phase. Three to six genes are typically selected via this process. A single 
performance is then obtained using the corresponding testing set. A neural network 
is trained on each training set and then validated on the corresponding testing set. 
All 28 genes in the data set are used with the neural network model. 

[236] (E) Results: Model performance is presented in Table 29. The knn model
performed the best overall. If the best common gene selection is used, knn is still the
best, though the performance mean is more in line with the logistic and centroid
models. Logistic and centroid models perform the next best overall, and either could
be used successfully, with a less than 25 percent misclassification error, if a
database-independent solution is preferred. Log transformations of the data produced
mixed results when used with logistic and neural network models, suggesting that such
a transformation has little impact. Tree and neural network models perform the poorest
on average; however, all of the models perform well for this type of data on at least
some of the training and testing pairs, with the equivalent of a less than 20 percent
misclassification error. The knn, centroid and neural network models could be
improved by a more thorough gene selection scheme.

[237] Table 30 presents logistic discrimination coefficients derived from this
analysis. These coefficients may be used in a logistic discriminant model to obtain
predictions of kidney toxicity when expression values for the indicated genes are
determined using appropriate samples and an appropriate microarray expression
detection system such as the Rat CT array used to develop the Database.

[238] Similarly, the classification model for all of the data using a classification tree
in S-Plus software provided the following rule for predicting toxicity: if Gadd45 <
1.474 AND Tissue inhibitor of metalloproteinases 1 < 1.786, then "No" (not toxic),
otherwise "Yes" (toxic).
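
By way of illustration only, the two-gene rule can be expressed in Python as shown below, together with a geometric mean of the per-class fractions correctly classified. The function names and the example expression values are hypothetical. The definition of the geometric mean performance measure as the square root of the product of the two class accuracies is an assumption of this sketch; applied to the counts reported for this rule on the entire database (2 of 38 toxic and 5 of 203 non-toxic samples misclassified), it reproduces the reported value of 0.961267.

```python
import math

GADD45_CUTOFF = 1.474   # threshold on Gadd45 from the rule in the text
TIMP1_CUTOFF = 1.786    # threshold on Tissue inhibitor of metalloproteinases 1

def predict_toxic(gadd45, timp1):
    """Apply the two-gene rule: "No" only if both values are below their cutoffs."""
    if gadd45 < GADD45_CUTOFF and timp1 < TIMP1_CUTOFF:
        return "No"
    return "Yes"

def geometric_mean_accuracy(toxic_wrong, toxic_total, nontoxic_wrong, nontoxic_total):
    """Geometric mean of the fractions of each class classified correctly (assumed GMM)."""
    toxic_correct = 1 - toxic_wrong / toxic_total
    nontoxic_correct = 1 - nontoxic_wrong / nontoxic_total
    return math.sqrt(toxic_correct * nontoxic_correct)

# Hypothetical expression values for two samples; illustrative only.
print(predict_toxic(1.2, 1.5))   # both values below their cutoffs
print(predict_toxic(1.9, 1.5))   # Gadd45 not below its cutoff

# Misclassification counts reported for this rule on the entire database.
gmm = geometric_mean_accuracy(2, 38, 5, 203)
print(round(gmm, 6))
```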

[239] For this model and rule, the internal performance with the entire database
was as follows: a total of 7 of 241 samples were misclassified, a misclassification
error of 0.03. A total of 2 of 38 samples of the yes class (toxic) and 5 of 203 samples
of the no class (not toxic) were misclassified. This is equivalent to misclassification
errors of 0.053 and 0.025, respectively. The geometric mean performance measure is
0.961267. This model rule can be applied to obtain predictions of kidney toxicity when
expression values for the indicated genes are determined using appropriate samples
and an appropriate microarray expression detection system such as the Rat CT array
used to develop the Database.

[240] Example 8 Use of Predictive Genes to Predict Kidney Toxicity for Samples

External to the Database: (A) Materials and Methods: (A)(1) Animal Treatment and
Tissue Harvest: Male Sprague-Dawley rats in groups of 3 were treated by
intraperitoneal injection with test compounds (cephaloridine, 1500 mg/kg and
cisplatin, 20 mg/kg) or only with the vehicle in which the compound was mixed. At
specified timepoints (6h and 24h) the rats were euthanized and tissues collected.
Kidney tissues were immediately placed into liquid nitrogen and frozen within 3
minutes of the death of the animal to ensure that mRNA did not degrade. The
tissues were sent blinded to be evaluated. The organs/tissues were then packaged
into well-labeled, freezer-quality plastic bags and stored at -80 degrees until needed
for isolation of the mRNA from a portion of the organ/tissue sample.

[241] (A)(2) Gene Expression Measurement: Isolation of RNA, preparation of
cDNA labeled probes and hybridization procedures were as described in Example 1
Materials and Methods. Probes were hybridized to the Rat CT chip, which is the same
array as used for the database.

[242] (B) Data Analysis: Array data from the samples were loaded into GeneSpring
software using the same procedures as used for the database. No kidney toxicity
parameters were entered for these samples. The Predict Parameter Value tool was
used to make toxicity predictions using different Combo Gene sets from the 24 hour
data and the entire database as the training set. Other values used were 10 nearest
neighbors and a p-value ratio cutoff of 0.5.

[243] (C) Results: Table 31 presents predictions for samples that were external
to the database used to derive the predictive genes. The samples were kidney
samples from replicate animals treated with cephaloridine and cisplatin. One of
these compounds (cisplatin) is also represented in the database (at a different dose
level) and the other compound, cephaloridine, is not in the database. Histopathology
conducted on the kidney samples verified that these treatments induced kidney
tubular necrosis. Each of the Combo gene sets correctly predicted that these
samples had expression patterns indicative of kidney toxicity.

[244] These results demonstrate clearly that the discovered sets of predictive
genes, in conjunction with the database and the k-nearest neighbor (knn) model, can
accurately predict toxicity from microarray data that are external to the database.
Because the database consists mostly of non-toxic samples, the prediction of toxicity
for these samples is significantly different from what would be expected by chance.
It is also noteworthy that three different sets of predictive genes are capable of
making accurate predictions.

[245] Example 9 Clustering Analysis to Identify Coordinately Behaving Subsets of
Predictive Genes

(A) Materials and Methods

(A)(1) Gene Expression Data: Gene expression data used for cluster analysis
were the 24 hour kidney expression data of the 28 genes of the Combo 6
predictive gene set. These data are shown in Table 39.

[246] (B) Cluster Analysis: Cluster analysis tools used in these analyses included
the K-means and gene tree features of GeneSpring software and Ward's clustering
algorithm in S-Plus statistical analysis software.

[247] (C) Results: Figure 7 presents combined results of K-means and gene-tree
hierarchical clustering analysis. Combo 6 (28 genes) was clustered using K-means
(number of clusters 10, maximum iterations 100, similarity measure Pearson) and Gene
tree (separation ratio 0.5, minimum distance 0.001, similarity measure Pearson). The
K-means clusters are colored according to the corresponding set 1 to set 10. The
gene names on the display from top to bottom correspond to left to right cluster bars.
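
By way of illustration only, the Pearson similarity measure named above, and its use in the assignment step of a K-means style clustering, can be sketched in Python as follows. This is not the GeneSpring implementation; the function names and the expression profiles are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def most_similar_cluster(profile, cluster_centers):
    """Index of the cluster center with the highest Pearson similarity to the profile."""
    scores = [pearson(profile, center) for center in cluster_centers]
    return scores.index(max(scores))

# Hypothetical expression profiles of one gene and two cluster centers
# across four samples; illustrative values only.
gene = [1.0, 2.0, 3.0, 4.0]
centers = [[4.0, 3.0, 2.0, 1.0], [0.5, 1.0, 1.5, 2.0]]
print(most_similar_cluster(gene, centers))
```

Because Pearson similarity depends on the shape of a profile rather than its magnitude, the gene in this sketch is assigned to the second center even though that center's values are half as large.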

[248] Ward's cluster analysis results are shown in Figure 8. The cluster tree for the
Combo 6 genes is shown with the best cut line indicating 7 clusters. Gene names
corresponding to numbers are indicated in tabular form below the diagram.



[249] Example 10. Use of Expression Profiles of Predictive Genes in a Computer
Program Product to Predict Renal Toxicity

(A) Materials and Methods

(A1) Overview of Computer Program Product: A computer program product
produces a prediction of the occurrence of a kidney toxicity using input gene
expression data from test samples. The model and data for the computer
program have been primarily validated using Phase-1 Rat CT arrays and Phase-1
Rat CT expression data in the Phase-1 TOXBank database as described in
previous examples. In other embodiments, expression data from other
expression platforms (such as TaqMan using SYBR Green technology) may also
be used in the computer program product. Those skilled in the art are capable of
developing and validating scaling factors to adjust for differences in differential
gene expression sensitivity and responsiveness among different platforms used
in the computer program product.

[250] The computer program product uses the Predictive Model as described in the
previous examples. The computer program product contains an encrypted training
data set that includes differential gene expression values and an endpoint
classification for each sample in the training set. The computer program product
samples are from the same timepoint (e.g., gene expression measured at 24 hours
after dosing) and the classification is binary for the specific endpoint (e.g., kidney
tubular necrosis or no kidney tubular necrosis). The computer program product also
contains encrypted lists of the Combo sets of predictive genes (also called
Predictagen sets). Inputs to the Predictive Model of the computer program product
are the k value for number of nearest neighbors and the type of distance measure to
be used in the model. Data inputs for the Predictive Model include the Combo list(s)
of predictive genes and training set as encrypted "plug-in" files and specification of a
test data file(s) that has expression data.

[251] The initial prediction is made after calculating the probability that the
tabulated votes are different from the proportion of votes in the training set for each
classification. A statistical test (hypergeometric mean distribution) is run for each
classification and p-values are calculated. The classification prediction is the
class that has the highest p-value. A classification cutoff procedure is used that
uses the p-value ratio (1 - p0/p1, where p0 is the p-value for the not predicted class
and p1 is the p-value for the predicted class). If the p-value ratio does not exceed a
specified cutoff value (input to the computer program product by the user), then a
prediction is not made. The Prediction Machine can be used with multiple
Predictagen sets, with the classifications, p-values and p-value ratios calculated as
above. In this case an overall prediction is made by combining the predictions of the
individual Predictagen sets. Each Predictagen set is weighted by a performance
number. The overall certainty for this combined prediction is calculated by a paired
value t-test using the p-value ratio and (1 - p-value ratio) for each Predictagen set as a
pair of values. The certitude is 1-p, where p is the value for the paired value t-test.
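
By way of illustration only, the p-value ratio and cutoff procedure described above can be sketched in Python as follows. The function names, the class labels, and the example p-values are hypothetical, and the p-values themselves are assumed to have been computed beforehand by the statistical test.

```python
def p_value_ratio(p_predicted, p_not_predicted):
    """Ratio 1 - p0/p1, with p1 for the predicted class and p0 for the other class."""
    return 1 - p_not_predicted / p_predicted

def call_with_cutoff(p_values, cutoff):
    """Return (predicted_class, ratio); predicted_class is None if no call is made.

    p_values maps each of the two class labels to its p-value.
    """
    predicted = max(p_values, key=p_values.get)   # class with the highest p-value
    other = min(p_values, key=p_values.get)
    if p_values[predicted] == 0:
        return None, 0.0                          # no usable evidence for either class
    ratio = p_value_ratio(p_values[predicted], p_values[other])
    if ratio > cutoff:
        return predicted, ratio
    return None, ratio                            # cutoff not exceeded: no prediction

# Hypothetical p-values for one sample; illustrative only.
print(call_with_cutoff({"toxic": 0.8, "not toxic": 0.2}, cutoff=0.5))
print(call_with_cutoff({"toxic": 0.5, "not toxic": 0.4}, cutoff=0.5))
```

In the second call the two p-values are too close together, so the ratio falls below the cutoff and no prediction is returned.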

[252] (A2) Computer Program Product Input: Encrypted training data is included as
a plug-in module for the software. User input includes specification of encrypted
Predictagen gene lists and samples for prediction (files with gene expression data).
Additional specifications are the distance measure to be used in the knn model
(currently Euclidean), the number of neighbors and a certitude cutoff (p-value ratio
cutoff).
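
By way of illustration only, the tabulation of nearest-neighbor votes using a Euclidean distance measure and a user-specified number of neighbors can be sketched in Python as follows. The function names and the training data are hypothetical and are not the encrypted training set of the computer program product.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def tabulate_votes(test_vector, training_set, k):
    """Count the class labels of the k training samples nearest to test_vector.

    training_set is a list of (expression_vector, class_label) pairs.
    """
    ranked = sorted(training_set, key=lambda item: euclidean(test_vector, item[0]))
    return Counter(label for _, label in ranked[:k])

# Hypothetical two-gene training data; illustrative values only.
training = [
    ([1.0, 1.1], "negative"),
    ([0.9, 1.0], "negative"),
    ([1.1, 0.9], "negative"),
    ([2.5, 3.0], "kidney tubular necrosis"),
    ([2.8, 2.7], "kidney tubular necrosis"),
]
votes = tabulate_votes([2.6, 2.9], training, k=3)
print(votes)
```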

[253] (A3) Program Operation: The program is executed as follows. First, on the
Prediction tab, the 'Load Predictagens' button is clicked to load the desired
predictagen(s). In this example, the 24 hour kidney Predictagen is loaded. Next, a
predictagen in the Predictagen sets list box is highlighted (in this example, 24 hour
kidney) and the 'Make Predictor' button is clicked. If necessary, the predictor is
highlighted and the 'Configure' button is clicked to set parameter values. Next, the
'Load Samples' button is clicked. Sample data is loaded as text files in the format
shown in Table 44. Samples are then selected from the Samples list box using the left
mouse button; the CTRL key is held down at the same time to make multiple
selections. In this example, 3 kidney samples from rats treated with 25 mg/kg
paraquat and 3 kidney samples from rats treated with 80 mg/kg phenobarbital are
selected. The samples were treated and processed for gene expression analysis as
described in the previous examples. The 'Add to predictor' button is then clicked,
and the 'Predict' button is then clicked to generate the program's output.

[254] On the Output tab, the 'Summary', 'Detail', or 'Full' radio buttons are selected
to control the amount of information displayed about the prediction. The 'Tabular
Report' checkbox is checked to put the output in a format that can be loaded into
Excel as tab-delimited text. The 'Save', 'Copy', 'Print', and 'Clear' buttons are
selected to save the output, copy the output to the clipboard, print the output, or clear
the output window prior to another prediction.

[255] (A4) Computer Program Product Output: The summary view displays
sample information, the call (kidney tubular necrosis or negative), and the overall
certitude. The detail view presents the individual calls and 1-p-value ratio for each
Predictagen, in addition to summary view information. The full view presents, for
each sample and Predictagen gene list, the specific nearest neighbors and their
classification (votes) along with the hypergeometric mean p values for each
classification. The detail view information is presented at the end of this information.

[256] (B) Test Data: Table 43 displays the test set of gene expression data used 

to generate predictions. The table shows the correct classification of kidney samples 
that have histopathology (kidney tubular necrosis) or no histopathology. 

[257] (C) Results: Table 42 displays the summary output of the computer program
after loading. Two out of three of the paraquat samples (sample #s 16477 and
16479) were correctly predicted for rat kidney tubular necrosis (with certitudes of
0.472 and 0.796). Three out of three of the phenobarbital samples were correctly
predicted as negative for kidney tubular necrosis. Table 43 displays the detailed
output of the computer program, which shows the individual performances of the 24
hour kidney Combo sets and the overall certitude score.

[258] Example 11 Selection and Validation of Protein Biomarker Candidates.
Protein marker candidates can be selected from biomarker genes using a number of
parameters. Table 44 presents biomarker genes sorted in order of their mean
individual gene predictive performance (percent correct calls) for all genes exhibiting
≥ 60 percent correct calls. Each gene was then evaluated for evidence of whether it
codes for a protein. This is clearly a key criterion for a protein marker. The next
parameter evaluated was the relative transcriptional response in toxic versus
non-toxic samples. If protein levels are proportional to RNA levels, then these columns
indicate the relative potential magnitude of the protein marker in toxic and non-toxic
samples. The better marker candidates should be those genes exhibiting the larger
differences in RNA expression. A number of additional criteria can be considered,
including protein MW, occurrence of the protein in tissues other than the target tissue
and availability of antibodies that will recognize the protein. One important criterion
may also be whether the protein is secreted. The last column in Table 44 indicates
that 3 of the proteins are known to be secreted. Table 37 lists proteins known to be
secreted derived from the total list of predictive genes. The property of secretion may
be useful in identification of proteins which could be biomarkers in serum or possibly
other matrices such as urine or saliva.
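
By way of illustration only, the selection criteria described above (a minimum of 60 percent correct calls, evidence that the gene codes for a protein, and ranking by predictive performance) can be sketched in Python as follows. The class, field and function names are hypothetical, and the candidate entries are placeholders rather than values from Table 44.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    percent_correct: float    # mean individual gene predictive performance
    codes_for_protein: bool
    secreted: bool

def rank_candidates(candidates, min_percent_correct=60.0):
    """Keep protein-coding genes at or above the threshold, best performers first."""
    kept = [
        c for c in candidates
        if c.percent_correct >= min_percent_correct and c.codes_for_protein
    ]
    return sorted(kept, key=lambda c: c.percent_correct, reverse=True)

# Placeholder candidates; names and numbers are illustrative only.
candidates = [
    Candidate("gene_A", 72.0, True, False),
    Candidate("gene_B", 55.0, True, True),
    Candidate("gene_C", 81.0, True, True),
    Candidate("gene_D", 90.0, False, False),
]
ranked = rank_candidates(candidates)
for c in ranked:
    print(c.name, c.percent_correct, "secreted" if c.secreted else "not secreted")
```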

[259] Protein markers can be rapidly evaluated by testing for levels of the identified 

marker candidates using any of a number of analytical techniques for measuring 
specific protein levels such as Western blots or ELISA assays. Samples for analysis 
may be selected from a tissue bank such as that described in Example 1. Selection 
for analysis would include samples from toxic treatments and samples from non-toxic 
treatments. Quantitative protein marker data can be analyzed using the same 
approaches described in Example 2 for evaluation and validation of predictive 
performance of the protein markers. 

[260] Experimental data demonstrating application of this concept and
identification and validation of a protein marker were developed using antibodies to
clusterin and insulin-like growth factor binding protein 1. These genes were selected
from the list of genes in Table 44 based on available antibodies. Insulin-like growth
factor binding protein 1 is known to be secreted. Serum sample protein from four
pairs of animals (2 pairs treated with non-toxic compounds and 2 pairs treated with
kidney-toxic compounds) was analyzed using Western blot methods known to those
skilled in the art. The Western blot was probed with antibodies to insulin-like growth
factor binding protein 1 and clusterin.

[261] A scanned autoradiogram of results is presented in Figure 9. Clusterin
appeared to be of approximately equal abundance in the samples. Insulin-like growth
factor binding protein 1 protein levels clearly appeared to be proportional to the gene
expression levels observed in kidneys of these animals and were clearly elevated in
the kidney-toxic treatments compared to the non-toxic treatments. The insulin-like
growth factor binding protein 1 protein levels in serum were correlative at the
individual animal level with the transcription factor signals. These data clearly
indicate that predictive markers identified through transcript measurement and
analysis can also be predictive protein markers.

[262] It is understood that the examples and embodiments described herein are for 

illustrative purposes only and that various modifications or changes in light thereof 
will be suggested to persons skilled in the art and are to be included within the spirit 
and purview of this application and scope of the appended claims. All publications, 
patents and patent applications cited herein are hereby incorporated by reference in 
their entirety for all purposes to the same extent as if each individual publication, 
patent or patent application were specifically and individually indicated to be so 
incorporated by reference.

Table 1 Compounds, Dose Levels, Kidney Pathology and Abbreviations in the database


Compound | Dose Level | Abbreviation | Kidney Tubular Necrosis* | Score**
1-naphthylisothiocyanate | 15 mg/kg | ANIT 15 | no | 1
1-naphthylisothiocyanate | 60 mg/kg | ANIT 60 | no | 1
5-fluorouracil | 13 mg/kg | 5-FU 13 | no | 1
5-fluorouracil | 50 mg/kg | 5-FU 50 | no | 1
acetaminophen | 250 mg/kg | APAP 250 | no | 1
acetaminophen | 1000 mg/kg | APAP 1000 | no | 1
amphotericin B | 5 mg/kg | AMPB 5 | no |
amphotericin B | 20 mg/kg | AMPB 20 | no |
azathioprine | 50 mg/kg | AZA 50 | no |
azathioprine | 200 mg/kg | AZA 200 | no | 1
benzene | 0.25 ml/kg | BEN 250 | no | 1
benzene | 1 ml/kg | BEN 1000 | no |
benzo[a]pyrene | 30 mg/kg | BAP 30 | no |
bromobenzene | 0.2 ml/kg | BRB 200 | no |
bromobenzene | 0.8 ml/kg | BRB 800 | no |
busulfan | 14 mg/kg | BUS 14 | no | 1
cadmium chloride | 1 mg/kg | CAD 1 | no | 1
cadmium chloride | 2 mg/kg | CAD 2 | no | 1
cadmium chloride | 4 mg/kg | CAD 4 | yes (6h) |
carbon tetrachloride | 0.25 ml/kg | CCL4 250 | no | 1
carbon tetrachloride | 1 ml/kg | CCL4 1000 | no |
carmustine | 16 mg/kg | CAR 16 | no | 1
chloroform | 0.25 ml/kg | CHCL3 250 | yes |
chloroform | 0.5 ml/kg | CHCL3 500 | yes |
chlorpromazine | 8 mg/kg | CHLOR 8 | no | 1
chlorpromazine | 30 mg/kg | CHLOR 30 | no | 1
cisplatin | 2.5 mg/kg | CIS 2.5 | yes |
cisplatin | 10 mg/kg | CIS 10 | yes |
clofibrate | 75 mg/kg | CLO 75 | no | [illegible]
clofibrate | 250 mg/kg | CLO 250 | no | [illegible]
clozapine | 45 mg/kg | CLOZ 45 | no |
clozapine | 180 mg/kg | CLOZ 180 | no | 1
carboxy methyl cellulose | 30 mg/kg | CMC 30 | no | 1
cycloheximide | 0.5 mg/kg | CHEX 0.5 | no |
cycloheximide | 2 mg/kg | CHEX 2 | no |
cyclophosphamide | 25 mg/kg | CPHOS 25 | no |
cyclophosphamide | 100 mg/kg | CPHOS 100 | yes |
cyclosporin A | 20 mg/kg | CYCA 20 | no |
cyclosporin A | 80 mg/kg | CYCA 80 | no |
dexamethasone | 8 mg/kg | DEX 8 | no |
dexamethasone | 30 mg/kg | DEX 30 | no |
diflunisal | 25 mg/kg | DIF 25 | no |
diflunisal | 100 mg/kg | DIF 100 | no |
dimethylnitrosamine | 20 mg/kg | DMN 20 | no |
doxorubicin | 12 mg/kg | DOX 12 | no |
erythromycin estolate | 40 mg/kg | ERY 40 | no | [illegible]
erythromycin estolate | 160 mg/kg | ERY 160 | no | [illegible]
estradiol | 0.1 mg/kg | EST 0.1 | no |
estradiol | 0.4 mg/kg | EST 0.4 | no | 1
ethanol | 2.5 ml/kg | ETH 2500 | no |
ganciclovir | 50 mg/kg | GAN 50 | no | [illegible]
ganciclovir | 200 mg/kg | GAN 200 | yes | 3
gentamicin | 38 mg/kg | GEN 38 | no |
gentamicin | 150 mg/kg | GEN 150 | no |
hydroxyurea | 250 mg/kg | HYD 250 | yes |
hydroxyurea | 1000 mg/kg | HYD 1000 | yes |
isoniazid | 50 mg/kg | ISON 50 | no |
isoniazid | 200 mg/kg | ISON 200 | no |
ketoconazole | 20 mg/kg | KETO 20 | no |
ketoconazole | 80 mg/kg | KETO 80 | no | 1
lipopolysaccharide | 2 mg/kg | LPS 2 | yes |
lipopolysaccharide | 8 mg/kg | LPS 8 | yes |
methotrexate | 1.3 mg/kg | MET 1.3 | no |
methotrexate | 5 mg/kg | MET 5 | no |
naloxone | 45 mg/kg | NAL 45 | no | 1
naloxone | 180 mg/kg | NAL 180 | no | 1
phenobarbital | 20 mg/kg | PBARB 20 | no |
phenobarbital | 80 mg/kg | PBARB 80 | no |
phenylhydrazine | 20 mg/kg | PHEN 20 | no | 1
phenylhydrazine | 80 mg/kg | PHEN 80 | no | 1
polyethylene glycol | 5 ml/kg | PEG 5000 | no |
puromycin | 38 mg/kg | PUR 38 | no |
puromycin | 150 mg/kg | PUR 150 | no |
quinidine | 25 mg/kg | QUIN 25 | no |
quinidine | 100 mg/kg | QUIN 100 | no | 1
streptozotocin | 20 mg/kg | STRZ 20 | no |
streptozotocin | 75 mg/kg | STRZ 75 | no |
tamoxifen | 50 mg/kg | TAM 50 | no |
tamoxifen | 200 mg/kg | TAM 200 | no |
tetracycline | 50 mg/kg | TET 50 | yes |
tetracycline | 150 mg/kg | TET 150 | yes |
theophylline | 25 mg/kg | THEO 25 | no |
theophylline | 100 mg/kg | THEO 100 | no |

* Values in parentheses indicate that array data are only available for the indicated
time points

** Histopathology tubular necrosis severity scores. 1 = not remarkable; 2 and higher
indicate histopathology of increasing severity

Table 2 List of Genes Whose Expression at 24h Directly Correlates with Kidney Tubular
Necrosis at 72h, Ranked by Pearson Correlation Coefficient



Gene | Correlation Coefficient
Gadd153 | 0.692123
Gadd45 | 0.6542049
Insulin-like growth factor binding protein 1 | 0.6465685
PAR interacting protein | 0.6218616
RCT-144 | 0.6188912
Calpactin I heavy chain | 0.610469
Tissue inhibitor of metalloproteinases-1 | 0.5927494
60S ribosomal protein L6 (alternate clone 1) | 0.5900929
RCT-[illegible] | 0.5799504
[illegible] | 0.5752138
RCT-49 | 0.5744045
IgE binding protein | 0.5633063
Dynein light chain 1 | 0.561974
Clusterin | 0.5537873
60S ribosomal protein L6 | 0.5526743
Interleukin-1 beta | 0.5508332
Cathepsin L, sequence 2 | 0.5458164
Superoxide dismutase Mn | 0.5432356
Matrix metalloproteinase-1 | 0.5432082
Ribosomal protein S8 | 0.5429754
RCT-274 | 0.5399542
RCT-179 | 0.5396944
Ubiquitin D (Ubd) | 0.5390609
Thymosin beta-10 | 0.5375005
Multidrug resistant protein-1 | 0.5359658
Ribosomal protein S9 | 0.5295026
Uncoupling protein 2 | 0.5272409
Multidrug resistant protein-3 | 0.5255124
Beta-tubulin, class I | 0.5235234
RCT-145 | 0.5214936
CD44 metastasis suppressor gene | 0.521281
RCT-109 | 0.5141034
Alpha-tubulin | 0.5105499
Ribosomal protein L13A | 0.5068002
Zinc finger protein | 0.4949505
Ferritin H-chain | 0.493831
RCT-50 | 0.4927958
RCT-198 | 0.483781
RCT-158 | 0.4823461
c-myc | 0.4734444
RCT-60 | 0.4707905
Beta-actin, sequence 2 | 0.4689375
Canalicular multispecific organic anion transporter | 0.459423
MHC class I antigen RT1.A1(f) alpha-chain | 0.458286
Calgranulin B1 | 0.4560673
Osteopontin | 0.4508689
Complement component C3 | 0.4491239
Ubiquitin conjugating enzyme (RAD 6 homologue) | 0.446513
RCT-152 | 0.4463049
Alpha-fibrinogen | 0.4461847
RCT-293 | 0.4419801
Organic cation transporter 3 | 0.4411987
Keratinocyte growth factor | 0.4402586
RCT-24 | 0.4377164
RCT-18 | 0.4342767
RCT-241 | 0.4299609
RCT-138 | 0.4268714
DNA topoisomerase I | 0.4262425
RCT-149 | 0.4230694
RCT-192 | 0.4214455
RCT-127 | 0.4187711
RCT-126 | 0.4119079
RCT-258 | 0.4112586
RCT-91 | 0.4109416
Ceruloplasmin | 0.402974
Vacuole membrane protein 1 | 0.400575

Table 3 List of Genes Whose Expression at 24h Inversely Correlates with Kidney
Tubular Necrosis at 72h, Ranked by Spearman Correlation Coefficient





Gene | Correlation Coefficient
RCT-42 | -0.25083
Membrane bound cytochrome b5 | -0.25275
RCT-132 | -0.25352
RCT-99 | -0.25374
Four repeat ion channel | -0.25412
RCT-62 | -0.25524
RCT-137 | -0.25548
AT-1 | -0.25881
UDP-glucuronosyltransferase 2B | -0.26029
Calgranulin B4 | -0.26618
Methylacyl-CoA racemase alpha | -0.26791
Cyclin D1 | -0.27006
Organic anion transporting polypeptide 1 | -0.27038
Cystatin C | -0.27304
Matrin F/G | -0.27305
RCT-181 | -0.27455
RCT-25 | -0.27625
RCT-143 | -0.27626
RCT-93 | -0.28389
Protein tyrosine phosphatase alpha | -0.28421
RCT-79 | -0.28485
Caspase 2 | -0.28686
Vascular endothelial growth factor | -0.28716
Glutathione S-transferase Ya | -0.28785
Senescence marker protein-30 | -0.29192
RCT-178 | -0.29272
Organic anion transporter K1 | -0.29329
RCT-256 | -0.2943
25-DX | -0.29444
RCT-22 | -0.29564
Sarcoplasmic reticulum calcium ATPase | -0.2974
RCT-280 | -0.29749
RCT-148 | -0.30758
Arginosuccinate synthetase 1 | -0.30894
RCT-142 | -0.31028
RCT-260 | -0.31039
Apoptosis-regulating basic protein | -0.31798
Organic anion transporter 3 | -0.32302
Ornithine aminotransferase | -0.32748
Hemoglobin alpha 1 chain (alternate clone) | -0.33449
Cytochrome P450 2A3 | -0.33951
Hemoglobin alpha 1 chain | -0.34347
Selenoprotein P | -0.34685
Cytochrome P450 2C23 | -0.34696
Pancreatic secretory trypsin inhibitor type II (PSTI-II) | -0.34712
RCT-38 | -0.34982
Iron-responsive element-binding protein | -0.3572
RCT-10 | -0.36278
Epidermal growth factor | -0.36487
Sodium/glucose cotransporter 1 | -0.36594
Calgranulin B2 | -0.36604
Cytochrome c oxidase subunit II | -0.36678
RCT-89 | -0.37036
Acyl-CoA dehydrogenase, medium chain | -0.37526
RCT-39 | -0.37793
RCT-34 | -0.37992
Malate dehydrogenase, cytosolic | -0.38206
D-dopachrome tautomerase | -0.38497
RCT-87 | -0.3857
Pancreatic secretory trypsin inhibitor type II (PSTI-II) (alternate clone) | -0.40004
RCT-101 | -0.40144
RCT-69 | -0.40543
Thiopurine methyltransferase | -0.41035
Very long-chain acyl-CoA synthetase | -0.41248
Fatty acyl-CoA oxidase | -0.42391
RCT-287 | -0.4351
Dimethylarginine dimethylaminohydrolase | -0.4413
RCT-182 | -0.44238
RCT-291 | -0.4606
3-hydroxyisobutyrate dehydrogenase | -0.48712

Table 4 Distribution of Compounds* in Individual Training and Test Sets
for 24 Hour Kidney Data



Training and Test Set A

Training Set A Negative**: AMPB, AZA, CAD, CHLOR, CLO, CYCA, DEX, DIF, DOX, ERY,
EST, ETH, GEN, MET, PHEN, PUR, TAM, TET
Training Set A Positive: CIS, HYD, LPS, TET
Test Set A Negative: ANIT, 5-FU, APAP, BEN, BAP, BRB, BUS, CCL4, CAR, CLOZ, CMC,
CHEX, DMN, ISON, KETO, NAL, PBARB, PEG, QUIN, STRZ, THEO
Test Set A Positive: CHCL3, CPHOS, GAN





Training and Test Set 1

Training Set 1 Negative: AMPB, ANIT, AZA, BAP, CAD, CAR, CCL4, CHEX, CHLOR, CLO,
CYCA, DEX, EST, GEN, ISON, KETO, MET, PBARB, PHEN, QUIN, THEO
Training Set 1 Positive: CPHOS, GAN, LPS, TET
Test Set 1 Negative: 5-FU, APAP, BEN, BRB, BUS, CLOZ, CMC, DIF, DMN, DOX, ERY,
ETH, NAL, PEG, PUR, STRZ, TAM
Test Set 1 Positive: CHCL3, CIS, HYD









Training and Test Set 2

Training Set 2 Negative: AMPB, APAP, AZA, BAP, BEN, BUS, CAR, CCL4, CLO, CYCA,
DIF, DOX, ERY, EST, ETH, ISON, KETO, MET, PBARB, PEG, PHEN
Training Set 2 Positive: CHCL3, CIS, GAN, HYD
Test Set 2 Negative: 5-FU, ANIT, BRB, CAD, CHEX, CHLOR, CLOZ, CMC, DEX, DMN,
GEN, NAL, PUR, QUIN, STRZ, TAM, THEO
Test Set 2 Positive: CPHOS, LPS, TET









Training and Test Set 3

Training Set 3 Negative: ANIT, APAP, BEN, BUS, CAD, CAR, CHLOR, CLO, CLOZ, CMC,
DEX, DMN, EST, ETH, KETO, MET, NAL, PEG, QUIN, TAM, THEO
Training Set 3 Positive: CHCL3, CIS, GAN, HYD
Test Set 3 Negative: 5-FU, AMPB, AZA, BAP, BRB, CCL4, CHEX, CYCA, DIF, DOX, ERY,
GEN, ISON, PBARB, PHEN, PUR, STRZ
Test Set 3 Positive: CPHOS, LPS, TET









Training and Test Set 4

Training Set 4 Negative: 5-FU, APAP, BEN, CAR, CHEX, CHLOR, CLO, CLOZ, CMC, CYCA,
DIF, DMN, DOX, GEN, ISON, MET, NAL, PEG, PHEN, PUR, STRZ
Training Set 4 Positive: CHCL3, CIS, GAN, TET
Test Set 4 Negative: AMPB, ANIT, AZA, BAP, BRB, BUS, CAD, CCL4, DEX, ERY, EST,
ETH, KETO, PBARB, QUIN, TAM, THEO
Test Set 4 Positive: CPHOS, HYD, LPS









Training and Test Set 5

Training Set 5 Negative: AZA, BAP, BRB, BUS, CAR, CHEX, CHLOR, CLO, CLOZ, CYCA,
DIF, DMN, DOX, KETO, NAL, PBARB, PEG, PHEN, PUR, STRZ, TAM
Training Set 5 Positive: CPHOS, GAN, HYD, LPS
Test Set 5 Negative: 5-FU, AMPB, ANIT, APAP, BEN, CAD, CCL4, CMC, DEX, ERY, EST,
ETH, GEN, ISON, MET, QUIN, THEO
Test Set 5 Positive: CHCL3, CIS, TET









* For abbreviations please see Table 1 (Compound, Dose, Abbreviation, etc.)
** Negative = Compounds that did not elicit histopathology (score = 1)

Positive = Compounds that did elicit histopathology (score of 2 or greater)

Table 5 Predictive Genes for 24 Hour Expression Data



Gene Name 


Combination 
Category* 


60S ribosomal protein L6 (alternate clone 1) 


6 


Alpha-tubulin 


6 


Calpactin I heavy chain 


6 


Cathepsin L 


6 


Cathepsin L, sequence 2 


6 


CDK108 


6 


Clusterin 


6 


c-myc 


6 


Dynein light chain 1 


6 


Gadd153 


6 


Gadd45 


6 


Insulin-like growth factor binding protein 1 


6 


PAR interacting protein 


6 


RCT-109 


6 


RCT-144 


6 


RCT-145 


6 


RCT-152 


6 


RCT-158 


6 


RCT-198 


6 


Vacuole membrane protein 1 


6 


RCT-24 


6 


RCT-241 


6 


RCT-271 


6 


RCT-68 


6 


Ribosomal protein L13A 


6 


Ribosomal protein S8 


6 


Tissue inhibitor of metalloproteinases-1 


6 


Uncoupling protein 2 


6 


60S ribosomal protein L6 


5 


Alpha-fibrinogen 


5 


Beta-actin, sequence 2 


5 


Beta-tubulin, class I 


5 


Canalicular multispecific organic anion transporter 


5 


Carbonic anhydrase III, sequence 2 


5 


Heme binding protein 23 


5 


IgE binding protein 


5 


Keratinocyte growth factor 


5 


MHC class I antigen RT1.A1(f) alpha-chain 


5 


Multidrug resistant protein-3 


5 


Osteopontin 


5 


RCT-126 


5 


RCT-179 


5 
RCT-182 


5 


Calgranulin Bl 


5 


RCT-258 


5 


RCT-274 


5 


RCT-49 


5 


RCT-50 


5 


RCT-60 


5 


Proliferating cell nuclear antigen gene 


5 


Ribosomal protein S9 


5 


Thymosin beta-10 


5 


Zinc finger protein 


5 


Preproalbumin, sequence 2 (alternate clone 1) 


4 


ATP-stimulated glucocorticoid-receptor translocation promoter (Gyk) 


4 


CD44 metastasis suppressor gene 


4 


Ceruloplasmin 


4 


Connexin-32 


4 


Epidermal growth factor 


4 


Ferritin H-chain 


4 


Hypoxanthine-guanine phosphoribosyltransferase 


4 


Interleukin-1 beta 


4 


Matrix metalloproteinase-1 


4 


Multidrug resistant protein-1 


4 


Organic cation transporter 3 


4 


Pancreatic secretory trypsin inhibitor type II (PSTI-II) 


4 


RCT-138 


4 


RCT-180 


4 


RCT-240 


4 


RCT-287 


4 


RCT-293 


4 


RCT-38 


4 


Pyruvate kinase, muscle 


4 


Ref-1 


4 


Superoxide dismutase Mn 


4 


Ubiquitin conjugating enzyme (RAD 6 homologue) 


4 


Pancreatic secretory trypsin inhibitor type II (PSTI-II) (alternate clone) 


3 


Annexin V 


3 


Aspartoacylase 


3 


Calreticulin 


3 


Cathepsin S 


3 


Dimethylarginine dimethylaminohydrolase 


3 


Ecto-ATPase 


3 


Methylacyl-CoA racemase alpha 


3 


p53 


3 


RCT-10 


3 


RCT-149 


3 
RCT-192 


3 


RCT-196 


3 


RCT-22 


3 


RCT-256 


3 


Ubiquitin D (Ubd) 


3 


RCT-34 


3 


RCT-8 


3 


RCT-89 


3 


Activin receptor type II 


2 


Casein-alpha 


2 


CDK102 


2 


Cellular nucleic acid binding protein (CNBP) 


2 


Complement component C3 


2 


Defender against cell death-1 


2 


DNA topoisomerase I 


2 


Elongation factor-1 alpha 


2 


Fatty acyl-CoA oxidase 


2 


Fetuin beta (Fetub) 


2 


Glucose transporter 1 


2 


Glycine methyl transferase 


2 


Histidine-rich glycoprotein 


2 


Hypoxia-inducible factor 1 alpha 


2 


Insulin-like growth factor binding protein 3 


2 


Malate dehydrogenase, cytosolic 


2 


N-hydroxy-2-acetylaminofluorene sulfotransferase (ST1C1) 


2 


Organic anion transporter 3 


2 


Organic anion transporting polypeptide 1 


2 


Ornithine aminotransferase 


2 


RCT-127 


2 


RCT-155 


2 


RCT-162 


2 


Calgranulin B4 


2 


Calgranulin B5 


2 


RCT-242 


2 


RCT-244 


2 


RCT-246 


2 


RCT-260 


2 


RCT-280 


2 


RCT-291 


2 


RCT-292 


2 


RCT-42 


2 


RCT-84 


2 


RCT-88 


2 


RCT-91 


2 


RCT-92 


2 
Proteasome activator 28 alpha 


2 


Ribosomal protein L27 


2 


Selenoprotein P 


2 


Senescence marker protein-30 


2 


Stathmin 


2 


Thioredoxin-1 (Trx1) 


2 


Vascular cell adhesion molecule 1 (VCAM-1) 


2 


Vesicular monoamine transporter (VMAT) 


2 


14-3-3 zeta 


1 


Acyl-CoA dehydrogenase, medium chain 


1 


Adrenodoxin reductase 


1 


Alcohol dehydrogenase 1 


1 


Alpha-2-macroglobulin 




Arginosuccinate synthetase 1 


1 


Bcl-2 




Calnexin 


1 


Carbonyl reductase 


1 


Cholesterol esterase 


1 


Cytochrome P450 14DM 


1 


Cytochrome P450 2A3 


1 


Cytochrome P450 2C11 


1 


Cytochrome P450 2C23 


1 


DNA binding protein inhibitor ID2 


1 


eIF-4E 


1 


Equilibrative nitrobenzylthioinosine-sensitive nucleoside transporter 


1 


Fibrinogen gamma chain 


1 


Gamma-glutamyl transpeptidase 


1 


Glucose-6-phosphate dehydrogenase 


1 


Glucose-regulated protein 78 


1 


Heme oxygenase 


1 


HMG CoA reductase 


1 


Iron-responsive element-binding protein 




Low density lipoprotein receptor 


1 


Macrophage inflammatory protein-1 alpha 


1 


Macrophage metalloelastase 


1 


Mitogen activated protein kinase (P38) 


1 


Monocyte chemotactic protein receptor (CCR2) 




Mullerian inhibiting substance 


1 


Na/K ATPase alpha-1 




N-cadherin 




Nerve growth factor receptor 




Organic anion transporter K1 




Organic cation transporter 2 




Peroxisomal multifunctional enzyme type II 




Peroxisome proliferator activated receptor alpha 
RCT-165 


1 


RCT-252 


1 


RCT-101 


1 


RCT-111 


1 


Protein O-mannosyltransferase 1 (Pomt1) 


1 


RCT-129 




Apoptosis-regulating basic protein 




RCT-140 


1 


RCT-147 




RCT-153 




RCT-164 


1 


RCT-166 


1 


RCT-18 


1 


RCT-181 


1 


RCT-185 


1 


RCT-206 


1 


RCT-220 


1 


RCT-221 


1 


Inositol polyphosphate multikinase (Ipmk) 




RCT-268 


1 


RCT-276 


1 


RCT-279 




RCT-31 




RCT-36 


1 


RCT-43 


1 


RCT-61 


1 


RCT-72 


1 


RCT-76 


1 


Renal organic anion transporter 


1 


Retinoid X receptor alpha 




Retinol dehydrogenase type III 


1 


Retinol-binding protein (RBP) 




Sarcoplasmic reticulum calcium ATPase 




Sulfotransferase K2 




Superoxide dismutase Cu/Zn 




T-cell cyclophilin 




Thiol-specific antioxidant (natural killer cell-enhancing factor B) 




Thiopurine methyltransferase 




Thrombin receptor (PAR-1) 





* Combination category is the number of training/test set gene list occurrences. 
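
The combination category described in the footnote above can be sketched as a simple occurrence count. This is a minimal illustration only: the function name and the example gene lists are hypothetical, and it assumes each training/test set yields one list of predictive genes.

```python
from collections import Counter


def combination_categories(gene_lists):
    """Count, for each gene, how many per-set gene lists contain it."""
    counts = Counter()
    for genes in gene_lists:
        # set() ensures a gene repeated within one list is counted once
        counts.update(set(genes))
    return dict(counts)


# Hypothetical gene lists for three training/test sets (illustrative only)
example_lists = [
    ["Gadd45", "Clusterin", "Osteopontin"],
    ["Gadd45", "Clusterin"],
    ["Gadd45", "p53"],
]
categories = combination_categories(example_lists)
print(categories)
```

With six training/test sets, a gene found in every list would fall in category 6 and a gene found in only one list in category 1.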
Table 6 Randomly Selected Gene Subsets from 24 H Combo All (216 Genes)* 



Rand 5 (1) 


Rand 5 (2) 


CDK108 


Preproalbumin, sequence 2 (alternate clone 1) 


Ferritin H-chain 


Adrenodoxin reductase 


Histidine-rich glycoprotein 


RCT-111 


RCT-182 


RCT-198 


Inositol polyphosphate multikinase (Ipmk) 


RCT-206 



Rand 10 (1) 


Rand 10 (2) 




Cathepsin S 


Bcl-2 


Cellular nucleic acid binding protein (CNBP) 


Cytochrome P450 2A3 


Cholesterol esterase 


Defender against cell death-1 


DNA binding protein inhibitor ID2 


Ferritin H-chain 


DNA topoisomerase I 


MHC class I antigen RT1.A1(f) alpha-chain 


Iron-responsive element-binding protein 


RCT-221 


RCT-126 


RCT-267 


Apoptosis-regulating basic protein 


RCT-287 


RCT-211 


RCT-49 


RCT-88 


Tissue inhibitor of metalloproteinases-1 






Rand 15 (1) 


Rand 15 (2) 


Cellular nucleic acid binding protein (CNBP) 


Glucose transporter 1 


Gamma-glutamyl transpeptidase 


Organic anion transporter K1 


Glucose transporter 1 


Pancreatic secretory trypsin inhibitor type II (PSTI-II) 


Glucose-regulated protein 78 


RCT-111 


Hypoxia-inducible factor 1 alpha 


RCT-127 


Multidrug resistant protein-3 


RCT-152 


Organic cation transporter 3 


RCT-214 


Peroxisomal multifunctional enzyme type II 


RCT-240 


RCT-126 


RCT-274 


RCT-242 


RCT-279 


RCT-280 


RCT-292 


RCT-287 


RCT-34 


RCT-88 


RCT-8 


Retinol dehydrogenase type III 


T-cell cyclophilin 


Superoxide dismutase Cu/Zn 


Vesicular monoamine transporter (VMAT) 



* Genes were randomly selected from the Combo All list of predictive genes (216 genes) by 
assigning a random number to each gene, sorting by the random number, and selecting the 
appropriate number of sorted genes. 
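
The selection procedure in the footnote above (assign a random number to each gene, sort by it, take the first n) can be sketched as follows. This is an illustrative sketch, not the original tooling: the function name, the `seed` parameter, and the placeholder gene names are assumptions.

```python
import random


def random_gene_subset(genes, n, seed=None):
    """Pick n genes by keying each gene with a random number and sorting."""
    rng = random.Random(seed)  # seed is optional; it only makes runs repeatable
    keyed = [(rng.random(), gene) for gene in genes]  # one random number per gene
    keyed.sort(key=lambda pair: pair[0])              # sort by the random number
    return [gene for _, gene in keyed[:n]]            # keep the first n sorted genes


# Placeholder names standing in for the 216 Combo All genes
genes = [f"gene_{i}" for i in range(216)]
subset = random_gene_subset(genes, 5, seed=1)
print(subset)
```

Sorting on random keys draws without replacement, so no gene can appear twice within one subset.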
Table 7 Randomly Selected Gene Subsets from 24 H Combo 6 Gene Set (28 Genes)* 



Rand 5 (1) 


Rand 5 (2) 


Calpactin I heavy chain 


Cathepsin L, sequence 2 


Clusterin 


RCT-152 


Dynein light chain 1 


RCT-271 


RCT-109 


RCT-68 


Ribosomal protein L13A 


Tissue inhibitor of metalloproteinases-1 



Rand 10 (1) 


Rand 10 (2) 


Alpha-tubulin 


Cathepsin L 


Cathepsin L 


PAR interacting protein 


Cathepsin L, sequence 2 


RCT-144 


c-myc 


RCT-198 


Dynein light chain 1 


Vacuole membrane protein 1 


Gadd153 


RCT-24 


RCT-109 


RCT-241 


RCT-152 


RCT-271 


RCT-198 


Ribosomal protein L13A 


Tissue inhibitor of metalloproteinases-1 


Uncoupling protein 2 



Rand 15 (1) 


Rand 15 (2) 


60S ribosomal protein L6 (alternate clone 1) 


60S ribosomal protein L6 (alternate clone 1) 


Calpactin I heavy chain 


Cathepsin L 


Cathepsin L 


Cathepsin L, sequence 2 


CDK108 


Dynein light chain 1 


Clusterin 


Gadd153 


Dynein light chain 1 


Insulin-like growth factor binding protein 1 


Gadd153 


PAR interacting protein 


Gadd45 


RCT-109 


RCT-109 


RCT-145 


RCT-152 


RCT-152 


Vacuole membrane protein 1 


RCT-198 


RCT-241 


RCT-24 


RCT-68 


RCT-241 


Tissue inhibitor of metalloproteinases-1 


RCT-68 


Uncoupling protein 2 


Tissue inhibitor of metalloproteinases-1 
* Genes were randomly selected from the Combo All list of predictive genes (216 genes) by assigning a 
random number to each gene, sorting by the random number, and selecting the appropriate number of sorted 
genes. 

Table 8 Randomly Selected Gene Subsets from 24 H Combo 5 Gene Set (25 genes)* 



Rand 5 (1) 


Rand 5 (2) 


Canalicular multispecific organic anion transporter 


Beta-tubulin, class I 


IgE binding protein 


Heme binding protein 23 


RCT-211 


Osteopontin 


RCT-258 


RCT-211 


Zinc finger protein 


RCT-60 



Rand 10 (1) 


Rand 10 (2) 


Beta-actin, sequence 2 


60S ribosomal protein L6 


Beta-tubulin, class I 


Beta-tubulin, class I 


Carbonic anhydrase III, sequence 2 


Carbonic anhydrase III, sequence 2 


IgE binding protein 


IgE binding protein 


MHC class I antigen RT1.A1(f) alpha-chain 


MHC class I antigen RT1.A1(f) alpha-chain 


RCT-126 


Multidrug resistant protein-3 


RCT-258 


RCT-182 


RCT-50 


RCT-274 


RCT-60 


RCT-50 


Ribosomal protein S9 


Ribosomal protein S9 



Rand 15 (1) 


Rand 15 (2) 


Beta-actin, sequence 2 


Alpha-fibrinogen 


Canalicular multispecific organic anion transporter 


Carbonic anhydrase III, sequence 2 


Carbonic anhydrase III, sequence 2 


Heme binding protein 23 


Heme binding protein 23 


IgE binding protein 


IgE binding protein 


Keratinocyte growth factor 


Keratinocyte growth factor 


Multidrug resistant protein-3 


Multidrug resistant protein-3 


RCT-126 


Osteopontin 


RCT-179 


RCT-179 


RCT-182 


RCT-211 


RCT-258 


RCT-258 


RCT-274 


RCT-60 


RCT-49 


Proliferating cell nuclear antigen gene 


RCT-60 


Ribosomal protein S9 


Ribosomal protein S9 


Zinc finger protein 


Thymosin beta-10 



* Genes were randomly selected from the Combo All list of predictive genes (216 genes) by assigning a 
random number to each gene, sorting by the random number, and selecting the 
appropriate number of sorted genes. 

Table 9 Randomly Selected Gene Subsets from 24 H Combo 4 Gene Set (23 genes)* 



Rand 5 (1) 


Rand 5 (2) 


Hypoxanthine-guanine phosphoribosyltransferase 


Hypoxanthine-guanine phosphoribosyltransferase 


Matrix metalloproteinase-1 


Multidrug resistant protein-1 


Multidrug resistant protein-1 


Pancreatic secretory trypsin inhibitor type II (PSTI-II) 


RCT-240 


RCT-38 


RCT-293 


Ref-1 



Rand 10 (1) 

ATP-stimulated glucocorticoid-receptor translocation promoter (Gyk) 

Ceruloplasmin 

Matrix metalloproteinase-1 

RCT-138 

RCT-240 

RCT-293 

RCT-38 

Pyruvate kinase, muscle 

Superoxide dismutase Mn 

Ubiquitin conjugating enzyme (RAD 6 homologue) 



Rand 10 (2) 

Organic cation transporter 3 

Preproalbumin, sequence 2 (alternate clone 1) 

ATP-stimulated glucocorticoid-receptor translocation promoter (Gyk) 

Ceruloplasmin 

Hypoxanthine-guanine phosphoribosyltransferase 

Multidrug resistant protein-1 

RCT-180 

RCT-240 

RCT-287 

Pyruvate kinase, muscle 



Rand 15 (1) 

Preproalbumin, sequence 2 (alternate clone 1) 

ATP-stimulated glucocorticoid-receptor translocation promoter (Gyk) 

CD44 metastasis suppressor gene 

Epidermal growth factor 

Hypoxanthine-guanine phosphoribosyltransferase 

Interleukin-1 beta 
Matrix metalloproteinase-1 

Multidrug resistant protein-1 

Organic cation transporter 3 

RCT-180 

RCT-240 

RCT-38 

Pyruvate kinase, muscle 

Superoxide dismutase Mn 

Ubiquitin conjugating enzyme (RAD 6 homologue) 



Rand 15 (2) 

Preproalbumin, sequence 2 (alternate clone 1) 

ATP-stimulated glucocorticoid-receptor translocation promoter (Gyk) 

CD44 metastasis suppressor gene 

Connexin-32 

Epidermal growth factor 

Matrix metalloproteinase-1 

Multidrug resistant protein-1 

Organic cation transporter 3 

Pancreatic secretory trypsin inhibitor type II (PSTI-II) 

RCT-287 

RCT-293 

Pyruvate kinase, muscle 

Ref-1 

Superoxide dismutase Mn 

Ubiquitin conjugating enzyme (RAD 6 homologue) 

* Genes were randomly selected from the Combo All list of predictive genes (216 genes) by 
assigning a random number to each gene, sorting by the random number, and selecting the 
appropriate number of sorted genes. 
Table 10 Randomly Selected Gene Subsets from Array Genes Excluding Combo All Set* 



Rand 5 (1) 

Argininosuccinate lyase 

RCT-115 

RCT-37 

RCT-9 

Phosphoglycerate kinase 



Rand 5 (2) 

RCT-247 

Inter-alpha-inhibitor H4 heavy chain (Itih4) 

RCT-290 

RCT-96 

Very long-chain acyl-CoA dehydrogenase 



Rand 10 (1) 

AT-1 

Cellular retinoic acid binding protein 2 

Ornithine decarboxylase 

Peroxisomal 3-ketoacyl-CoA thiolase 1 

RCT-107 

RCT-117 

RCT-130 

RCT-134 

RCT-137 

RCT-175 



Rand 10 (2) 

Aryl sulfotransferase 

BAK 

Cyclooxygenase 2 

L-gulono-gamma-lactone oxidase 

Metallothionein 1 

Osteoactivin 

RCT-12 

Protein kinase C alpha 

Putative membrane fatty acid transporter 

RAC protein kinase beta 



Rand 15 (1) 

Adrenomedullin 

ATA 

Calpain 2 

Cyclin G 

Cytochrome P450 17A 

Endogenous retroviral sequence, 5' and 3' LTR 

NADPH cytochrome P450 oxidoreductase 

Paraoxonase 1 

RCT-102 

RCT-143 

RCT-208 

RCT-225 

RCT-253 

RCT-52 

Urate oxidase 



Rand 15 (2) 

Alpha 1-antitrypsin 

BAK 

Bile salt export pump (sister of p-glycoprotein) 

C4b-binding protein 

Choline kinase 

Cyclin dependent kinase 2 

Extracellular-signal-regulated kinase 1 

Glutathione S-transferase PI 

Histone 2A 

RCT-25 

RCT-57 

RCT-66 

RCT-7 

RCT-87 

Poly(ADP-ribose) polymerase 



* Genes were randomly selected from the entire array list of genes excluding the Combo 
All 216 predictive genes by assigning a random number to each gene, sorting by the 
random number and selecting the appropriate number of sorted genes. 
Table 11 Kidney Toxicity Individual Sample Prediction Values for 24 Hour Data 
Predictive Genes (Combined List and Subsets) 

                              Prediction Measure* 
Gene Set    Number of Genes   Accuracy**            False Positive**      False Negative**      Geometric Mean** 
Combo All   216               0.915 (0.861-0.945)   0.046 (0.012-0.108)   0.310 (0.200-0.467)   0.810 (0.720-0.884) 
Combo 6     28                0.921 (0.867-0.955)   0.062 (0.031-0.108)   0.300 (0.050-0.533)   0.837 (0.660-0.953) 
Combo 5     25                0.896 (0.829-0.929)   0.073 (0.044-0.122)   0.269 (0.200-0.467)   0.821 (0.684-0.870) 
Combo 4     23                0.882 (0.829-0.929)   0.087 (0.010-0.145)   0.325 (0.000-0.467)   0.776 (0.700-0.925) 
Combo 3     19                0.839 (0.778-0.911)   0.127 (0.054-0.215)   0.358 (0.133-0.667)   0.740 (0.562-0.892) 
Combo 2     45                0.733 (0.641-0.821)   0.215 (0.113-0.349)   0.586 (0.400-0.867)   0.552 (0.343-0.663) 
Combo 1     76                0.787 (0.667-0.884)   0.171 (0.054-0.322)   0.464 (0.333-0.867)   0.645 (0.355-0.782) 



* Prediction measures are given as means and range of values (in parentheses) for six 
training/test sets using 24 hour array data and gene lists. Unit of prediction was the 
animal and the predictive classification was for kidney tubular necrosis observed at 72 
hours after treatment. 



** Standard prediction measures were used as defined in Materials and Methods. These 
include: 

Accuracy = Proportion of total number of predictions that are correct 

False positive rate = Proportion of negative cases that are incorrectly classified as 
positive 

False negative rate = Proportion of positive cases that are incorrectly classified as 
negative 

Geometric mean = Performance measure that takes into account proportion of 
positive and negative cases 
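
The four prediction measures defined above can be sketched from confusion-matrix counts. This is a hedged illustration: the exact geometric-mean formula is given in Materials and Methods, not here, so the code assumes the common definition, the square root of (true positive rate × true negative rate). The function name and the example counts are hypothetical.

```python
import math


def prediction_measures(tp, fp, tn, fn):
    """Compute accuracy, false positive/negative rates and a geometric mean.

    tp/fn count positive cases called positive/negative;
    tn/fp count negative cases called negative/positive.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total           # correct predictions / all predictions
    false_positive_rate = fp / (fp + tn)   # negatives misclassified as positive
    false_negative_rate = fn / (fn + tp)   # positives misclassified as negative
    # Assumed definition: sqrt(sensitivity * specificity)
    geometric_mean = math.sqrt((1 - false_negative_rate) * (1 - false_positive_rate))
    return {
        "accuracy": accuracy,
        "false_positive_rate": false_positive_rate,
        "false_negative_rate": false_negative_rate,
        "geometric_mean": geometric_mean,
    }


# Hypothetical counts: 10 positive and 50 negative animals
m = prediction_measures(tp=7, fp=5, tn=45, fn=3)
print(m)
```

Under this assumed definition, a classifier that calls every sample negative scores a geometric mean of zero even when negatives dominate, which is why the measure is reported alongside accuracy.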



90 



WO 03/100030 



PCT/US03/06196 



Table 12 Kidney Toxicity Compound-Dose Prediction Values for 24 Hour Data 
Predictive Genes (Combined List and Subsets) 

                              Prediction Measure* 
Gene Set    Number of Genes   Accuracy**            False Positive**      False Negative**      Geometric Mean** 
Combo All   216               0.932 (0.889-0.950)   0.048 (0.000-0.097)   0.206 (0.000-0.500)   0.859 (0.688-0.967) 
Combo 6     28                0.950 (0.889-1.000)   0.041 (0.000-0.065)   0.161 (0.000-0.400)   0.894 (0.749-0.973) 
Combo 5     25                0.945 (0.861-1.000)   0.041 (0.000-0.097)   0.189 (0.000-0.400)   0.878 (0.736-0.984) 
Combo 4     23                0.909 (0.889-0.950)   0.059 (0.000-0.107)   0.378 (0.000-0.600)   0.751 (0.622-0.945) 
Combo 3     19                0.915 (0.892-0.974)   0.067 (0.030-0.125)   0.200 (0.000-0.500)   0.857 (0.688-0.985) 
Combo 2     45                0.849 (0.757-0.892)   0.105 (0.061-0.188)   0.489 (0.167-1.000)   0.608 (0.000-0.868) 
Combo 1     76                0.847 (0.778-0.895)   0.117 (0.053-0.194)   0.408 (0.167-0.750)   0.712 (0.487-0.863) 



* Prediction measures are given as means and range of values (in parentheses) for six 
training/test sets using 24 hour array data and gene lists. Unit of prediction was compound-dose 
level and the predictive classification was for kidney tubular necrosis observed at 72 hours after 
treatment. Prediction for compound-dose was based on a majority of individual animal calls. In 
cases where there were an equal number of opposing calls or no calls a no-call was assigned to 
the compound-dose level. 
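
The majority rule in the footnote above can be sketched as follows. This is an illustrative sketch under one stated assumption: individual animals with no call are ignored when counting the majority. The function name and the string labels are hypothetical.

```python
def compound_dose_call(animal_calls):
    """Combine individual animal calls into one compound-dose call.

    Each call is "positive", "negative", or anything else (treated as no call).
    """
    positives = sum(1 for call in animal_calls if call == "positive")
    negatives = sum(1 for call in animal_calls if call == "negative")
    if positives > negatives:
        return "positive"
    if negatives > positives:
        return "negative"
    # Equal numbers of opposing calls, or no calls at all
    return "no-call"


print(compound_dose_call(["positive", "positive", "negative"]))
print(compound_dose_call(["positive", "negative"]))
```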

** Standard prediction measures were used as defined in Materials and Methods. As described 
in Materials and Methods, in cases where no prediction was made because the p-value ratio 
exceeded the cutoff value (generally 0.5), the non-call was considered to be incorrect. 
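
The scoring of non-calls can be sketched as below. This is a minimal illustration: it assumes a non-call is represented as `None`, leaves the p-value ratio test itself out, and uses a hypothetical function name and example data.

```python
def accuracy_with_no_calls(predictions, truths):
    """Accuracy where a no-call (None) always counts as an incorrect prediction."""
    correct = sum(
        1
        for predicted, actual in zip(predictions, truths)
        if predicted is not None and predicted == actual
    )
    return correct / len(truths)


# Hypothetical example: one no-call and one wrong call out of four cases
predictions = ["positive", None, "negative", "negative"]
truths = ["positive", "positive", "negative", "positive"]
print(accuracy_with_no_calls(predictions, truths))
```

Counting non-calls as errors is the conservative choice: a classifier cannot raise its reported accuracy by declining to call difficult cases.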
Table 13 Kidney Toxicity Compound Prediction Values for 24 Hour Data Predictive 
Genes (Combined List and Subsets) 

                              Prediction Measure* 
Gene Set    Number of Genes   Accuracy**            False Positive**      False Negative**      Geometric Mean** 
Combo All   216               0.944 (0.900-1.000)   0.057 (0.000-0.118)   0.056 (0.000-0.333)   0.941 (0.797-1.000) 
Combo 6     28                0.968 (0.950-1.000)   0.037 (0.000-0.059)   0.000 (0.000-0.000)   0.981 (0.970-1.000) 
Combo 5     25                0.968 (0.950-1.000)   0.037 (0.000-0.059)   0.000 (0.000-0.000)   0.981 (0.970-1.000) 
Combo 4     23                0.921 (0.875-0.950)   0.047 (0.000-0.118)   0.278 (0.000-0.667)   0.816 (0.563-0.970) 
Combo 3     19                0.928 (0.850-0.950)   0.077 (0.048-0.176)   0.056 (0.000-0.333)   0.931 (0.797-0.970) 
Combo 2     45                0.881 (0.750-0.950)   0.086 (0.048-0.235)   0.333 (0.000-1.000)   0.706 (0.000-0.970) 
Combo 1     76                0.904 (0.850-1.000)   0.067 (0.000-0.118)   0.278 (0.000-0.667)   0.810 (0.563-1.000) 



* Prediction measures are given as means and range of values (in parentheses) for six 
training/test sets using 24 hour array data and gene lists. Unit of prediction was the compound 
and the predictive classification was for kidney tubular necrosis observed at 72 hours after 
treatment. Compounds were considered toxic if any compound-dose level for that compound 
was predicted as toxic. 
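
The compound-level rule in the footnote above (toxic if any compound-dose level is predicted toxic) can be sketched as follows. The function name and labels are hypothetical, and the sketch assumes that any call other than "positive", including a no-call, does not by itself make the compound toxic.

```python
def compound_call(dose_level_calls):
    """A compound is called positive (toxic) if any dose level is called positive."""
    if any(call == "positive" for call in dose_level_calls):
        return "positive"
    return "negative"


print(compound_call(["negative", "positive"]))
print(compound_call(["negative", "negative"]))
```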

** Standard prediction measures were used as defined in Materials and Methods. As described 
in Materials and Methods, in cases where no prediction was made because the p-value ratio 
exceeded the cutoff value (generally 0.5), the non-call was considered to be incorrect. 
Table 14 Order of Genes Used for Cumulative Analysis of 
Predictive Performance of Predictive Combo Gene Sets* 



Combo 6 Gene Set 

Gadd45 

Gadd153 

Clusterin 

Cathepsin L 

PAR interacting protein 

Tissue inhibitor of metalloproteinases-1 

Insulin-like growth factor binding protein 1 

Cathepsin L, sequence 2 

Dynein light chain 1 

RCT-68 

Calpactin I heavy chain 

Alpha-tubulin 

60S ribosomal protein L6 (alternate clone 1) 

Vacuole membrane protein 1 

RCT-241 

RCT-144 

RCT-271 

RCT-24 

RCT-145 

Uncoupling protein 2 

c-myc 

CDK108 

Ribosomal protein S8 

RCT-152 

RCT-158 

Ribosomal protein L13A 

RCT-109 

RCT-198 



Combo 5 Gene Set 

RCT-182 

Carbonic anhydrase III, sequence 2 

RCT-258 

60S ribosomal protein L6 

RCT-274 

Multidrug resistant protein-3 

Osteopontin 

Beta-actin, sequence 2 

Beta-tubulin, class I 

Zinc finger protein 

Canalicular multispecific organic anion transporter 

Keratinocyte growth factor 

Alpha-fibrinogen 

Ribosomal protein S9 

RCT-60 

RCT-179 

Thymosin beta-10 

Proliferating cell nuclear antigen gene 

IgE binding protein 

RCT-211 

RCT-49 

RCT-50 

Heme binding protein 23 

MHC class I antigen RT1.A1(f) alpha-chain 

RCT-126 



Combo 4 Gene Set 

Pancreatic secretory trypsin inhibitor type II (PSTI-II) 

RCT-240 

Epidermal growth factor 

Matrix metalloproteinase-1 

RCT-287 

Connexin-32 

ATP-stimulated glucocorticoid-receptor translocation promoter (Gyk) 

Superoxide dismutase Mn 

Pyruvate kinase, muscle 

Ferritin H-chain 

Multidrug resistant protein-1 

RCT-293 

Interleukin-1 beta 

Organic cation transporter 3 

Preproalbumin, sequence 2 (alternate clone 1) 

CD44 metastasis suppressor gene 

Ubiquitin conjugating enzyme (RAD 6 homologue) 

RCT-38 

Ref-1 

Ceruloplasmin 

Hypoxanthine-guanine phosphoribosyltransferase 

RCT-138 

RCT-180 



* Genes are listed in the order in which they were used for cumulative analysis of 
predictive performance 
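
One way to read "cumulative analysis" is that predictive performance is evaluated on the first gene, then the first two genes, and so on in the listed order. The sketch below builds those nested gene sets under that assumption. The function name is hypothetical, and the example uses only the first three Combo 6 genes from the table above.

```python
def cumulative_gene_sets(ordered_genes):
    """Return the nested gene sets: first 1 gene, first 2 genes, ..., all genes."""
    return [ordered_genes[:k] for k in range(1, len(ordered_genes) + 1)]


# First three genes of the Combo 6 ordering in Table 14
for gene_set in cumulative_gene_sets(["Gadd45", "Gadd153", "Clusterin"]):
    print(len(gene_set), gene_set)
```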
Table 15 Individual Gene Predictions: Combo 6 

                                                Overall Correct Calls 
Gene Name                                       Mean          s.d.          min           max 
60S ribosomal protein L6 (alternate clone 1)    71.4%         [illegible]   60.2%         [illegible] 
Alpha-tubulin                                   [illegible]   [illegible]   [illegible]   78.6% 
Calpactin I heavy chain                         [illegible]   [illegible]   [illegible]   73.2% 
Cathepsin L                                     [illegible]   [illegible]   [illegible]   83.3% 
Cathepsin L, sequence 2                         [illegible]   [illegible]   [illegible]   83.9% 
CDK108                                          [illegible]   [illegible]   [illegible]   82.1% 
Clusterin                                       [illegible]   [illegible]   [illegible]   84.9% 
c-myc                                           [illegible]   [illegible]   [illegible]   80.2% 
Dynein light chain 1                            [illegible]   [illegible]   [illegible]   80.2% 
Gadd153                                         [illegible]   [illegible]   [illegible]   87.5% 
Gadd45                                          [illegible]   [illegible]   [illegible]   91.1% 
Insulin-like growth factor binding protein 1    [illegible]   [illegible]   [illegible]   74.6% 
PAR interacting protein                         [illegible]   [illegible]   [illegible]   75.9% 
RCT-109                                         [illegible]   [illegible]   [illegible]   72.3% 
RCT-144                                         [illegible]   [illegible]   [illegible]   91.3% 
RCT-145                                         [illegible]   [illegible]   [illegible]   89.7% 
RCT-152                                         [illegible]   [illegible]   [illegible]   81.0% 
RCT-158                                         67.6%         3.9%          61.3%         72.2% 
RCT-198                                         66.1%         8.3%          55.3%         78.1% 
Vacuole membrane protein 1                      60.8%         21.3%         40.0%         87.5% 
RCT-24                                          65.4%         11.1%         50.9%         82.1% 
RCT-241                                         79.9%         6.4%          73.3%         92.1% 
RCT-271                                         57.2%         15.2%         37.3%         76.8% 
RCT-68                                          64.5%         8.5%          56.2%         79.5% 
Ribosomal protein L13A                          55.8%         15.4%         27.0%         71.4% 
Ribosomal protein S8                            58.5%         18.8%         20.4%         70.5% 
Tissue inhibitor of metalloproteinases-1        74.0%         11.7%         56.5%         87.5% 
Uncoupling protein 2                            73.7%         4.4%          67.0%         78.1% 
Average Individual Combo 6                      67.7%         11.4%         51.3%         80.9% 
Minimum Individual Combo 6                      55.8%         3.9%          9.3%          70.5% 
Maximum Individual Combo 6                      79.9%         32.8%         73.3%         92.1% 
Table 16 Individual Gene Predictions: Combo 5 

                                                      Overall Correct Calls 
Gene Name                                             Mean          s.d.          min           max 
60S ribosomal protein L6                              75.6%         5.3%          69.1%         [illegible] 
Alpha-fibrinogen                                      62.2%         19.6%         27.6%         [illegible] 
Beta-actin, sequence 2                                65.8%         23.2%         19.0%         [illegible] 
Beta-tubulin, class I                                 58.4%         20.5%         21.9%         [illegible] 
Canalicular multispecific organic anion transporter   59.7%         6.9%          52.8%         [illegible] 
Carbonic anhydrase III, sequence 2                    [illegible]   [illegible]   [illegible]   81.8% 
Heme binding protein 23                               [illegible]   [illegible]   [illegible]   76.8% 
IgE binding protein                                   [illegible]   [illegible]   [illegible]   77.3% 
Keratinocyte growth factor                            [illegible]   [illegible]   [illegible]   70.6% 
MHC class I antigen RT1.A1(f) alpha-chain             [illegible]   [illegible]   [illegible]   66.4% 
Multidrug resistant protein-3                         66.8%         5.1%          60.7%         73.8% 
Osteopontin                                           75.3%         21.0%         33.3%         88.4% 
RCT-126                                               47.0%         9.1%          39.0%         61.9% 
RCT-179                                               67.2%         10.1%         56.3%         85.7% 
RCT-182                                               49.9%         29.4%         21.3%         86.6% 
RCT-211                                               55.9%         8.0%          45.6%         67.5% 
RCT-258                                               72.5%         15.1%         42.9%         82.7% 
RCT-274                                               69.8%         9.0%          58.3%         83.3% 
RCT-49                                                61.2%         19.2%         27.0%         79.4% 
RCT-50                                                58.2%         16.6%         25.7%         72.3% 
RCT-60                                                64.7%         12.1%         43.7%         78.6% 
Proliferating cell nuclear antigen gene               70.5%         11.9%         52.4%         84.9% 
Ribosomal protein S9                                  72.9%         9.4%          59.0%         83.9% 
Thymosin beta-10                                      67.8%         7.9%          59.1%         82.5% 
Zinc finger protein                                   55.0%         14.7%         35.2%         78.6% 
Average Combo 5                                       62.7%         14.5%         39.7%         78.0% 
Minimum Individual Combo 5                            47.0%         5.1%          8.7%          61.9% 
Maximum Individual Combo 5                            75.6%         29.4%         69.1%         88.4% 
Table 17 Kidney Toxicity Individual Sample Prediction Values 
for 24 Hour Data with Random Gene Subsets 

                              Prediction Measure** 
Gene Set    Random Subset*    Accuracy***           False Positive***     False Negative***     Geometric Mean*** 
Combo All   5 genes (1)       0.581 (0.324-0.778)   0.416 (0.180-0.700)   0.453 (0.350-0.533)   0.556 (0.374-0.658) 
Combo All   5 genes (2)       0.651 (0.476-0.812)   0.334 (0.155-0.578)   0.489 (0.300-0.933)   0.542 (0.227-0.712) 
Combo All   10 genes (1)      0.740 (0.593-0.875)   0.239 (0.099-0.441)   0.375 (0.200-0.533)   0.681 (0.591-0.842) 
Combo All   10 genes (2)      0.836 (0.786-0.929)   0.101 (0.031-0.172)   0.278 (0.000-0.533)   0.630 (0.250-0.804) 
Combo All   15 genes (1)      0.823 (0.718-0.884)   0.167 (0.072-0.349)   0.278 (0.000-0.533)   0.763 (0.644-0.913) 
Combo All   15 genes (2)      0.790 (0.713-0.911)   0.153 (0.031-0.269)   0.522 (0.400-0.650)   0.633 (0.535-0.719) 
Combo All   All 216 genes     0.915 (0.861-0.945)   0.046 (0.012-0.108)   0.310 (0.200-0.467)   0.810 (0.720-0.884) 

Combo 6     5 genes (1)       0.799 (0.713-0.845)   0.177 (0.078-0.33)    0.317 (0.000-0.533)   0.733 (0.645-0.862) 
Combo 6     5 genes (2)       0.757 (0.629-0.902)   0.222 (0.082-0.367)   0.336 (0.133-0.550)   0.713 (0.616-0.857) 
Combo 6     10 genes (1)      0.893 (0.861-0.944)   0.073 (0.031-0.118)   0.300 (0.200-0.400)   0.805 (0.744-0.878) 
Combo 6     10 genes (2)      0.872 (0.806-0.929)   0.096 (0.031-0.157)   0.317 (0.150-0.400)   0.784 (0.740-0.847) 
Combo 6     15 genes (1)      0.910 (0.886-0.955)   0.043 (0.010-0.086)   0.350 (0.267-0.467)   0.787 (0.710-0.852) 
Combo 6     15 genes (2)      0.914 (0.883-0.964)   0.050 (0.000-0.075)   0.292 (0.200-0.467)   0.819 (0.718-0.862) 
Combo 6     All 28 genes      0.921 (0.867-0.955)   0.062 (0.031-0.108)   0.300 (0.050-0.533)   0.837 (0.660-0.953) 

Combo 5     5 genes (1)       0.704 (0.591-0.841)   0.289 (0.108-0.489)   0.383 (0.050-0.533)   0.646 (0.539-0.697) 
Combo 5     5 genes (2)       0.797 (0.750-0.884)   0.164 (0.072-0.237)   0.400 (0.200-0.600)   0.702 (0.609-0.847) 
Combo 5     10 genes (1)      0.805 (0.718-0.848)   0.164 (0.090-0.277)   0.392 (0.250-0.733)   0.702 (0.493-0.791) 
Combo 5     10 genes (2)      0.864 (0.838-0.902)   0.102 (0.072-0.129)   0.333 (0.200-0.467)   0.772 (0.700-0.843) 
Combo 5     15 genes (1)      0.900 (0.864-0.937)   0.095 (0.027-0.167)   0.150 (0.000-0.333)   0.874 (0.805-0.914) 
Combo 5     15 genes (2)      0.867 (0.829-0.929)   0.104 (0.045-0.144)   0.292 (0.200-0.467)   0.794 (0.684-0.842) 
Combo 5     All 25 genes      0.896 (0.829-0.929)   0.073 (0.044-0.122)   0.269 (0.200-0.467)   0.821 (0.684-0.870) 

Combo 4     5 genes (1)       0.807 (0.680-0.873)   0.167 (0.082-0.325)   0.361 (0.200-0.467)   0.724 (0.686-0.777) 
Combo 4     5 genes (2)       0.710 (0.648-0.764)   0.290 (0.189-0.356)   0.333 (0.050-0.800)   0.669 (0.403-0.801) 
Combo 4     10 genes (1)      0.807 (0.705-0.884)   0.138 (0.062-0.256)   0.522 (0.350-0.867)   0.626 (0.350-0.751) 
Combo 4     10 genes (2)      0.809 (0.741-0.839)   0.166 (0.103-0.229)   0.367 (0.000-0.533)   0.716 (0.605-0.878) 
Combo 4     15 genes (1)      0.843 (0.800-0.911)   0.122 (0.021-0.217)   0.403 (0.000-0.600)   0.706 (0.601-0.885) 
Combo 4     15 genes (2)      0.854 (0.800-0.920)   0.114 (0.021-0.181)   0.356 (0.050-0.600)   0.744 (0.589-0.882) 
Combo 4     All 23 genes      0.882 (0.829-0.929)   0.087 (0.010-0.145)   0.325 (0.000-0.467)   0.776 (0.700-0.925) 



* Randomly selected sets of genes derived from the Combo sets. 

* Prediction measures are given as means and range of values (in parentheses) for six 
training/test sets using 24 hour array data and random subsets of genes. Unit of 
prediction was the animal and the predictive classification was for kidney tubular 
necrosis observed at 72 hours after treatment. 



** Standard prediction measures were used as defined in Materials and Methods. As 
described in Materials and Methods in cases where no prediction was made because the 
p-value ratio exceeded the cutoff-value (generally 0.5) the non-call was considered to be 
incorrect. 
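
By way of a non-limiting illustration, the prediction measures reported above may be computed per training/test set along the following lines. This is a minimal sketch only: the exact definitions are those given in Materials and Methods, which is not reproduced in this section. The sketch assumes that the false positive and false negative values are per-class error rates, that a non-call (represented here as None) counts as an error for its class, and that the geometric mean is the square root of the product of the two per-class correct rates. The function name prediction_measures is hypothetical.

```python
from math import sqrt


def prediction_measures(truth, calls):
    """Compute accuracy, false positive rate, false negative rate and geometric mean.

    truth: list of bool, True if the animal showed tubular necrosis at 72 hours.
    calls: list of True / False / None, where None is a non-call.
    Assumption: a non-call is scored as incorrect for its true class.
    """
    if len(truth) != len(calls) or not truth:
        raise ValueError("truth and calls must be non-empty and of equal length")

    pos_calls = [c for t, c in zip(truth, calls) if t]
    neg_calls = [c for t, c in zip(truth, calls) if not t]

    correct_pos = sum(1 for c in pos_calls if c is True)
    correct_neg = sum(1 for c in neg_calls if c is False)

    accuracy = (correct_pos + correct_neg) / len(truth)
    false_negative = 1 - correct_pos / len(pos_calls) if pos_calls else 0.0
    false_positive = 1 - correct_neg / len(neg_calls) if neg_calls else 0.0
    # Assumed definition: geometric mean of the two per-class correct rates.
    geometric_mean = sqrt((1 - false_positive) * (1 - false_negative))

    return {
        "accuracy": accuracy,
        "false_positive": false_positive,
        "false_negative": false_negative,
        "geometric_mean": geometric_mean,
    }
```

Applying such a function to each of the six training/test sets, and then taking the mean and range of each measure, would give values of the form tabulated above.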
Table 18 Comparison of Predictivity for True Kidney Toxicity Classification and 
Random Classification Using Combo Gene Sets and Random Subsets and 24h data 

                               Accuracy                       Accuracy 
                               Correct Classification**       Random Classification** 
Gene List*   Gene Subset*      Mean     Min      Max          Mean     Min.     Max. 

Combo All    All Genes         0.911    0.861    0.945        0.173    0.024    0.304 
             5 genes (1)       0.581    0.324    0.778        0.265    0.076    0.381 
             5 genes (2)       0.651    0.476    0.813        0.240    0.093    n 4?Q 
             10 genes (1)      0.740    0.593    0.875        0.237    0.157    0.384 
             10 genes (2)      0.836    0.786    0.929        0.225    0 OfiS   n **04 
             15 genes (1)      0.823    0.718    0.884        0.252    0.074    0 3A4 
             15 genes (2)      0.790    0.713    0.911        0.228    0.102    0 3Q7 

Combo 6      All Genes         0.921    0.867    0.955        0.203    O 10?    0 3Q3 
             5 genes (1)       0.799    0.713    0.845        0.238    0.076    0.429 
             5 genes (2)       0.757    0.629    0.902        0.223    0.093    0.446 
             10 genes (1)      0.893    0.861    0.944        0.225    0.037    0.473 
             10 genes (2)      0.872    0.806    0.929        0.207    0.074    0.473 
             15 genes (1)      0.910    0.886    0.955        0.224    0.086    0.545 
             15 genes (2)      0.914    0.883    0.964        0.229    0.056    0.429 

Combo 5      All Genes         0.896    0.829    0.929        0.258    0.157    0.348 
             5 genes (1)       0.704    0.591    0.841        0.263    0.176    0.357 
             5 genes (2)       0.797    0.750    0.884        0.279    0.074    0.446 
             10 genes (1)      0.805    0.718    0.848        0.227    0.105    0.381 
             10 genes (2)      0.864    0.838    0.902        0.254    0.046    0.460 
             15 genes (1)      0.900    0.864    0.937        0.264    0.148    0.336 
             15 genes (2)      0.867    0.829    0.929        0.223    0.093    0.339 

Combo 4      All Genes         0.882    0.829    0.929        0.235    0.074    0.348 
             5 genes (1)       0.807    0.680    0.873        0.199    0.130    0.321 
             5 genes (2)       0.710    0.648    0.764        0.253    0.165    0.393 
             10 genes (1)      0.807    0.705    0.884        0.246    0.111    0.393 
             10 genes (2)      0.809    0.741    0.839        0.239    0.139    0.411 
             15 genes (1)      0.843    0.800    0.911        0.203    0.056    0.366 
             15 genes (2)      0.855    0.800    0.920        0.191    0.037    0.402 

Combo 3      All Genes         0.839    0.778    0.911        0.242    0.148    0.295 

Combo 2      All Genes         0.733    0.641    0.821        0.240    0.056    0.349 

Combo 1      All Genes         0.787    0.667    0.884        0.220    0.083    0.321 

All-Pred     5 genes (1)       0.372    0.229    0.500        0.234    0.220    0.242 
             5 genes (2)       0.355    0.194    0.518        0.258    0.102    0.429 
             10 genes (1)      0.565    0.448    0.661        0.208    0.130    0.268 
             10 genes (2)      0.541    0.380    0.696        0.246    0.171    0.375 
             15 genes (1)      n.sn2    n?R7     Ofifil       0.233    0 20ft   0.2fifl 

* For Combo lists, all genes or random subsets were used. All-Pred used genes 
randomly selected from genes that were present on the array but not in the predictive 
list. 

** Accuracy = proportion of the total number of predictions that are correct. Non-calls 
are counted as incorrect predictions. Accuracy was calculated for correct 
classifications of kidney toxicity assigned to the samples and for randomized 
classifications in the same proportions as the correct classifications. Values presented 
are the mean accuracy values for 6 training/test sets with minimum and maximum 
accuracy values. 
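
As a non-limiting illustration of the comparison summarized in Table 18, the following sketch scores a set of predictions against the true classification and against a randomized classification having the same proportion of positive and negative samples, and then summarizes accuracy values across training/test sets as mean, minimum and maximum. It is a sketch under stated assumptions rather than the procedure actually used: the randomization is assumed here to be a simple permutation of the class labels, and the function names accuracy, randomized_labels and summarize are hypothetical.

```python
import random


def accuracy(labels, calls):
    """Proportion of predictions that are correct; a non-call (None) counts as incorrect."""
    correct = sum(1 for lab, call in zip(labels, calls) if call is not None and call == lab)
    return correct / len(labels)


def randomized_labels(labels, seed=0):
    """Return a permutation of the labels, preserving the class proportions (assumed scheme)."""
    shuffled = list(labels)
    random.Random(seed).shuffle(shuffled)
    return shuffled


def summarize(values):
    """Mean, minimum and maximum of accuracy values across training/test sets."""
    return sum(values) / len(values), min(values), max(values)
```

In use, accuracy would be evaluated once with the correct labels and once with randomized_labels(labels) for each of the six training/test sets, and summarize would be applied to each group of six values.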
Table 19 Distribution of Compounds* in Individual Training and Test Sets 
for 6 Hour Kidney Data 



Training and Test Set A 



Set A Train 


Set A Train 


Set A Test 


Set A Test 


Negative** 


Positive 


Negative 


Positive 


AMPB 


CAD 


ANIT 


CHCL3 


AZA 


CIS 


5-FU 


CPHOS 


CHLOR 


HYD 


APAP 


GAN 


CLO 


LPS 


BEN 




CYCA 


TET 


BAP 




DEX 




BRB 




DIF 




BUS 




DOX 




CCL4 




ERY 




CAR 




EST 




CLOZ 




ETH 




CMC 




GEN 




CHEX 




MET 




DMN 




PHEN 




ISON 




PUR 




KETO 




TAM 




NAL 




TET 




PBARB 








PEG 








QUIN 








STRZ 





Random Training and Test Set 1 (Randomly assigned) 



Training Set 1 Negative 


Training Set 1 
Positive 


Test Set 1 Negative 


Test Set 1 Positive 


ANIT 


CAD 


5-FU 


CHCL3 


APAP 


CIS 


AMPB 


CPHOS 


AZA 


GAN 


BRB 


LPS 


BAP 


HYD 


BUS 




BEN 


TET 


CCL4 




CAR 




CHLOR 




CHEX 




CYCA 




CLO 




ERY 




CLOZ 




EST 




CMC 




ETH 




DEX 




ISON 




DIF 




MET 




DMN 




STRZ 
DOX 








GEN 








KETO 








NAL 








PBARB 








PEG 








PHEN 








PUR 








QUIN 








TAM 









Random Training and Test Set 2 (Randomly assigned) 



Training Set 2 
Negative 


Training Set 2 
Positive 


Test Set 2 
Negative 


Test Set 2 Positive 


APAP 


CHCL3 


5-FU 


CAD 


AZA 


CPHOS 


AMPB 


CIS 


BUS 


HYD 


ANIT 


GAN 


CAR 


LPS 


BAP 




CCL4 


TET 


BEN 




CHLOR 




BRB 




CLO 




CHEX 




CLOZ 




CMC 




DEX 




CYCA 




DOX 




DIF 




EST 




DMN 




ETH 




ERY 




GEN 




ISON 




KETO 








MET 








NAL 








PBARB 








PEG 








PHEN 








PUR 








QUIN 








STRZ 








TAM 









Random Training and Test Set 3 (Randomly assigned) 



Training Set 3 
Negative 


Training Set 3 
Positive 


Test Set 3 
Negative 


Test Set 3 Positive 




AMPB 


CAD 


5-FU 


HYD 


ANIT 


CHCL3 


APAP 


LPS 


AZA 


CIS 


BAP 


TET 


BEN 


CPHOS 


BRB 




BUS 


GAN 


CAR 




CCL4 




CLOZ 




CHEX 




DEX 




CHLOR 








CLO 




DMN 




CMC 




ERY 




CYCA 




KETO 




DOX 




MET 




EST 




PEG 




ETH 








GEN 








ISON 








NAL 








PBARB 








PHEN 








PUR 








QUIN 








STRZ 








TAM 









Random Training and Test Set 4 (Randomly assigned) 



Training Set 4 
Negative 


Training Set 4 
Positive 


Test Set 4 
Negative 


Test Set 4 Positive 


ANIT 


CAD 


5-FU 


CIS 


APAP 


CHCL3 


AMPB 


CPHOS 


AZA 


GAN 


CAR 


TET 


BAP 


HYD 


CHEX 




BEN 


LPS 


CHLOR 




BRB 




CLO 




BUS 




CMC 




CCL4 




DEX 




CLOZ 




GEN 




CYCA 




ISON 




DIF 




QUIN 




DMN 




STRZ 




DOX 




TAM 




ERY 








EST 
ETH 








KETO 








MET 








NAL 








PBARB 








PEG 








PHEN 








PUR 









Random Training and Test Set 5 (Randomly assigned) 



Training Set 5 Neg 


Training Set 5 Pos 


Test Set 5 Neg 


Test Set 5 Pos 


5-FU 


CAD 


AMPB 


HYD 


APAP 


CHCL3 


ANIT 


LPS 


AZA 


CIS 


CCL4 


TET 


BAP 


CPHOS 


CHEX 




BEN 


GAN 


CHLOR 




BRB 




CLO 




BUS 




CLOZ 




CAR 




DIF 




CMC 




DMN 




CYCA 




GEN 




DEX 




ISON 




DOX 




NAL 




ERY 




PHEN 




EST 








ETH 








KETO 








MET 








PBARB 








PEG 








PUR 








QUIN 








STRZ 








TAM 









* For abbreviations please see Table 1 (Compound, Dose, Abbreviation, etc.) 
** Negative = Compounds that did not elicit histopathology (score = 1) 

Positive = Compounds that did elicit histopathology (score of 2 or greater) 
Table 20 List of Genes, Whose Expression at 6 h Directly Correlates with Kidney 
Tubular Necrosis at 72h, Ranked by Pearson Correlation Coefficient 





Combination 


Gene 


(No. of 




Occurrences) 


Alpha-tubulin 


6 


Calreticulin 


6 


Cathepsin L 


6 


c-H-ras 


6 


Cyclin E 


6 


Gadd153 


6 


Gadd45 


6 


Glyceraldehyde 3-phosphate dehydrogenase 


6 


ID-1 


6 


Insulin-like growth factor binding protein 1 


6 


Multidrug resistant protein-3 


6 


RCT-111 


6 


RCT-12 


6 


14-3-3 zeta 


5 


ADP-ribosylation factor-like protein ARL184 


5 


Aldehyde dehydrogenase 2 


5 


Beta-tubulin, class I 


5 


Decorin 


5 


Epidermal growth factor 


5 


Gamma-glutamyl transpeptidase 


5 


Heme binding protein 23 


5 


Na/K ATPase alpha-1 


5 


RCT-103 


5 


RCT-221 


5 


RCT-50 


5 


Pyruvate kinase, muscle 


5 


Ribosomal protein L13A 


5 


Superoxide dismutase Mn 


5 


Thymosin beta-10 


5 


Tryptophan hydroxylase 


5 


Zinc finger protein 


5 


alpha-1,2-fucosyltransferase 


4 


Aquaporin-3 (AQP3) 


4 


Cathepsin L, sequence 2 


4 


Endogenous retroviral sequence, 5' and 3' LTR 


4 


Hypoxanthine-guanine phosphoribosyltransferase 


4 


Interferon related developmental regulator IFRD1 




(PC4) 


4 
Interleukin-1 beta 


4 


Macrophage inflammatory protein -2 alpha 


4 


Peroxisomal 3-ketoacyl-CoA thiolase 2 


4 


RCT-102 


4 


RCT-109 


4 


RCT-144 


4 


RCT-24 


4 


Inositol polyphosphate multikinase (Ipmk) 


4 


RCT-49 


4 


Protein tyrosine phosphatase alpha 


4 


Thiol-specific antioxidant (natural killer cell- 
enhancing factor B) 


4 


Uncoupling protein 2 


4 


RCT-139 


3 


Bcl-2 


3 


Calpactin I heavy chain 


3 


c-fos 


3 


Connexin-32 


3 


Cytochrome P450 1A1 


3 


Ecto-ATPase 


3 


Heme oxygenase 


3 


Hepatocyte growth factor receptor 


3 


Integrin beta1 


3 


N-cadherin 


3 


N-hydroxy-2-acetylaminofluorene sulfotransferase 
(ST1C1) 


3 


Ornithine decarboxylase 


3 


RCT-147 


3 


RCT-182 


3 


RCT-228 


3 


RCT-240 


3 


RCT-245 


3 


RCT-277 


3 


RCT-43 


3 


RCT-83 


3 


Stathmin 


3 


Alpha-1 microglobulin/bikunin precursor (Ambp) 


2 


Aspartoacylase 


2 


Colony-stimulating factor-1 


2 


Equilibrative nitrobenzylthioinosine-sensitive 
nucleoside transporter 


2 


Ferritin H-chain 


2 


Glutathione S-transferase Yb2 subunit 


2 


IgE binding protein 


2 


Macrophage metalloelastase 


2 
Malate dehydrogenase, cytosolic 


2 


Matrix metalloproteinase-1 


2 


MHC class I antigen RT1.A1(f) alpha-chain 


2 


Monoamine oxidase A 


2 


NADPH cytochrome P450 reductase 


2 


RCT-108 


2 


RCT-127 


2 


Apoptosis-regulating basic protein 


2 


RCT-14 


2 


RCT-146 


2 


RCT-151 


2 


RCT-166 


2 


RCT-179 


2 


RCT-180 


2 


Calgranulin B 


2 


RCT-211 


2 


RCT-251 


2 


RCT-274 


2 


RCT-281 


2 


Voltage-dependent anion channel 2 (Vdac2) 


2 


RCT-60 


2 


RCT-76 


2 


RCT-80 


2 


Phosphatidylethanolamine-binding protein 


2 


PTEN/MMAC1 


2 


Sterol carrier protein 2 


2 


Thioredoxin-1 (Trx1) 


2 


Thioredoxin-2 (Trx2) 


2 


Tissue inhibitor of metalloproteinases-1 


2 


Transferrin 


2 


Hemoglobin alpha 1 chain (alternate clone) 




60S ribosomal protein L6 (alternate clone 1) 


1 


Acetylcholine receptor epsilon 


1 


Aldehyde dehydrogenase 1 


1 


Alpha-1 acid glycoprotein 


1 


Alpha-fibrinogen 


1 


Apolipoprotein CIII 




Argininosuccinate lyase 




ATP-stimulated glucocorticoid-receptor 
translocation promoter (Gyk) 




Calbindin-D (9K) 




Carbamyl phosphate synthetase I 




Caspase 7 




CD44 metastasis suppressor gene 




Cholesterol 7-alpha-hydroxylase (P450 VII) 
1 


f-TTI V/* 


I 


Cyclin dependent kinase 4 


1 


Cytochrome c oxidase subunit IV 


1 


Cytochrome P450 2C11 


1 


DNA topoisomerase I 




Dynein light chain 1 




Focal adhesion kinase (pp125FAK) 




Gamma-actin, cytoplasmic 




Hemoglobin alpha 1 chain 


1 






Hypoxia-inducible factor 1 alpha 


1 


T«f»nnAl1nln«< nnlntiitn Kin/linn nrA^Oin (nJl l/Px 1 

intracellular caiciuni-oinciing proiein ^ivuxro^ 




Jagged 1 




Major basic protein 1 




Methylacyl-CoA racemase alpha 


1 


Multidrug resistant protein-2 




Na/H antiporter (APNH1) 




NADP-dependent isocitrate dehydrogenase, 
cytosolic 


1 


nrntpin fPP^ 


1 


Ornithine aminotransferase 


2 


PAR interacting protein 


j 


PprrtYicrkmp QCCPtnVilv "fftPtrtr 




RCT-142 




RCT-148 








RCT-177 




RCT-194 


j 


RCT-198 




RCT-205 




RCT-214 


j 


RCT-246 


] 


RCT-268 


j 


RCT-28 


1 


RCT-280 




RCT-40 




RCT-53 




RCT-59 




RCT-61 




RCT-64 




RCT-66 




RCT-68 




RCT-74 
RCT-94 


1 


Phosphoglycerate kinase 




Retinoid X receptor alpha 




Sarcoplasmic reticulum calcium ATPase 




Serotonin transporter (SERT) 




Superoxide dismutase Cu/Zn 




Thymidylate synthase 




Transitional endoplasmic reticulum ATPase 




Very long-chain acyl-CoA synthetase 




VL30 element 
Table 21 List of Genes, Whose Expression at 6 h Inversely Correlates with 
Kidney Tubular Necrosis at 72h, Ranked by Spearman Correlation Coefficient 



" Gene 


Correlation 
Coefficient 


PPT -770 


-0.35021 


l-vvli CjCioprniin 


-0.35156 


Phosphoglycerate kinase 


-0.35228 


NADP-dependent isocitrate dehydrogenase, cytosolic 


-0.35233 


Senescence marker protein-30 


-0.35299 


iWs 1*1 UZr 


-0.35413 


rVJ(JltCl*l aLIU gljrtrUpilSldll 


-0.35452 


PPT-A1 


-0.35839 


Protein tyrosine phosphatase alpha 


-0.36371 


o y ioc riro nic rnju xdi i £*Qt. 


-0.36443 




-0 36576 


PPT 910 


-0.37234 


ppt 

£vv~> 1 - JU 


-0.37235 


PPT OA^ 


-0.37273 


Thymidylate synthase 


-0.37344 


L^yiocnrome c oxidase aUQunu 1 ^ajtciuaic ^iuiic/ 


-0.37359 


Malate dehydrogenase, cytosolic 


-0.37368 


PPT.^fifi 


-0.37504 


PPT-19R 
ivv^ i -1^0 


-0.37762 


RCT-55 


-0.37788 


Cytochrome P450 2A3 


-0.38623 


RCT-29 


-0.38647 


Transforming growth factor-beta3 


-0.38899 


Vesicular monoamine transporter (VMAT) 


-0.3894 


Adrenomedullin 


-0.38953 


RCT-28 


-0.39362 


RCT-83 


-0.39619 


RCT-155 


-0.39701 


RCT-98 


-0.39733 


Iron-responsive element-binding protein 


-0.4082 


Mullerian inhibiting substance 


-0.40974 


Inositol polyphosphate multikinase (Ipmk) 


-0.41105 


Sarcoplasmic reticulum calcium ATPase 


-0.41428 


Na/H antiporter (APNH1) 


-0.41496 


Maspin 


-0.41712 


Osteoactivin 


-0.42233 


Empty 


-0.4236 


Cytochrome P450 1A1 


-0.42401 


RCT-246 


-0.42616 
Protein kinase C alpha 


-0.43582 


Cyclin D1 


-0.43742 


Caveolin-3 


-0.44097 


RCT-3 


-0.44517 


RCT-69 


-0.4463 


RCT-80 


-0.44717 


RCT-194 


-0.4479 


Carbamyl phosphate synthetase I 


-0.44845 


RCT-119 


-0.45514 


Selenoprotein P 


-0.45557 


RCT-112 


-0.46143 


Hepatocyte nuclear factor 4 


-0.46336 


Macrophage metalloelastase 


-0.46368 


RCT-74 


-0.46524 


Decorin 


-0.46894 


RCT-139 


-0.47817 


Very long-chain acyl-CoA synthetase 


-0.48218 


Hepatocyte growth factor receptor 


-0.48369 


RCT-270 


-0.48392 


RCT-182 


-0.48529 


Histone 2A 


-0.51079 


Phospholipase D 


-0.51088 


Fatty acyl-CoA oxidase 


-0.5219 


RCT-268 


-0.52288 


Gamma-actin, cytoplasmic 


-0.54554 


Aquaporin-3 (AQP3) 


-0.5821 


Epidermal growth factor 


-0.62877 
Table 22 List of genes whose expression at 6 hours is predictive of kidney toxicity at 72 
hours 



Gene 


Combination (No. 
of Occurrences) 


Alpha-tubulin 


6 


Calreticulin 


6 


Cathepsin L 


6 


c-H-ras 


6 


Cyclin E 


6 


Gadd153 


6 


Gadd45 


6 


Glyceraldehyde 3-phosphate dehydrogenase 


6 


ID-1 


6 


Insulin-like growth factor binding protein 1 


6 


Multidrug resistant protein-3 


6 


RCT-111 


6 


RCT-12 


6 


14-3-3 zeta 


5 


ADP-ribosylation factor-like protein ARL184 


5 


Aldehyde dehydrogenase 2 


5 


Beta-tubulin, class I 


5 


Decorin 


5 


Epidermal growth factor 


5 


Gamma-glutamyl transpeptidase 


5 


Heme binding protein 23 


5 


Na/K ATPase alpha-1 


5 


RCT-103 


5 


RCT-221 


5 


RCT-50 


5 


Pyruvate kinase, muscle 


5 


Ribosomal protein L13A 


5 


Superoxide dismutase Mn 


5 


Thymosin beta-10 


5 


Tryptophan hydroxylase 


5 


Zinc finger protein 


5 


alpha-1,2-fucosyltransferase 


4 


Aquaporin-3 (AQP3) 


4 


Cathepsin L, sequence 2 


4 


Endogenous retroviral sequence, 5' and 3' LTR 


4 


Hypoxanthine-guanine 
phosphoribosyltransferase 


4 


Interferon related developmental regulator 
IFRD1 (PC4) 


4 


Interleukin-1 beta 


4 
Macrophage inflammatory protein-2 alpha 


4 


Peroxisomal 3-ketoacyl-CoA thiolase 2 


4 


RCT-102 


4 


RCT-109 


4 


RCT-144 


4 


RCT-24 


4 


Inositol polyphosphate multikinase (Ipmk) 


4 


RCT-49 


4 


Protein tyrosine phosphatase alpha 


4 


Thiol-specific antioxidant (natural killer cell- 
enhancing factor B) 


4 


Uncoupling protein 2 


4 


RCT-139 


3 


Bcl-2 


3 


Calpactin I heavy chain 


3 


c-fos 


3 


Connexin-32 


3 


Cytochrome P450 1A1 


3 


Ecto-ATPase 


3 


Heme oxygenase 


3 


Hepatocyte growth factor receptor 


3 


Integrin beta1 


3 


N-cadherin 


3 


N-hydroxy-2-acetylaminofluorene 
sulfotransferase (ST1C1) 


3 


Ornithine decarboxylase 


3 


RCT-147 


3 


RCT-182 


3 


RCT-228 


3 


RCT-240 


3 


RCT-245 


3 


RCT-277 


3 


RCT-43 


3 


RCT-83 


3 


Stathmin 


3 


Alpha-1 microglobulin/bikunin precursor 
(Ambp) 


2 


Aspartoacylase 


2 


Colony-stimulating factor-1 


2 


Equilibrative nitrobenzylthioinosine-sensitive 
nucleoside transporter 


2 


Ferritin H-chain 


2 


Glutathione S-transferase Yb2 subunit 


2 


IgE binding protein 


2 


Macrophage metalloelastase 


2 
Malate dehydrogenase, cytosolic 


2 


Matrix metalloproteinase-1 


2 


MHC class I antigen RT1.A1(f) alpha-chain 


2 


Monoamine oxidase A 


2 


NADPH cytochrome P450 reductase 


2 


RCT-108 


2 


RCT-127 


2 


Apoptosis-regulating basic protein 


2 


RCT-14 


2 


RCT-146 


2 


RCT-151 


2 


RCT-166 


2 


RCT-179 


2 


RCT-180 


2 


Calgranulin B 


2 


RCT-211 


2 


RCT-251 


2 


RCT-274 


2 


RCT-281 


2 


Voltage-dependent anion channel 2 (Vdac2) 


2 


RCT-60 


2 


RCT-76 


2 


RCT-80 


2 


Phosphatidylethanolamine-binding protein 


2 


PTEN/MMAC1 


2 


Sterol carrier protein 2 


2 


Thioredoxin-1 (Trx1) 


2 


Thioredoxin-2 (Trx2) 


2 


Tissue inhibitor of metalloproteinases-1 


2 


Transferrin 


2 


Hemoglobin alpha 1 chain (alternate clone) 




60S ribosomal protein L6 (alternate clone 1) 




Acetylcholine receptor epsilon 




Aldehyde dehydrogenase 1 




Alpha-1 acid glycoprotein 




Alpha-fibrinogen 




Apolipoprotein CIII 




Argininosuccinate lyase 




ATP-stimulated glucocorticoid-receptor 
translocation promoter (Gyk) 




Calbindin-D (9K) 




Carbamyl phosphate synthetase I 




Caspase 7 




CD44 metastasis suppressor gene 




Cholesterol 7-alpha-hydroxylase (P450 VII) 
c-jun 


I 


c-myc 




Cyclin dependent kinase 4 


1 


Cytochrome c oxidase subunit IV 




Cytochrome P450 2C11 




DNA topoisomerase I 




Dynein light chain 1 






2 " 


Gamma-actin, cytoplasmic 


2 


Hemoglobin alpha 1 chain 




Hepatocyte nuclear factor 4 




Hypoxia-inducible factor 1 alpha 




UllTaCCllUldr WolL-lUIIl-OlIlUIllg piVHwJl \LVU\rOJ 





Jagged 1 




Major basic protein 1 


1 









Multidrug resistant protein-2 




Na/H antiporter (APNH1) 


1 


NADP-dependent isocitrate dehydrogenase, 
cytosolic 


1 


NGF-inducible anti-proliferative putative 
secreted protein (PC3) 


1 


Ornithine aminotransferase 




PAR interacting protein 




PprnYicnmp scqptyiHIv factor 0 

rClUMS>UJ 11C <U!>dllL/ljr Id^LUl a> 




RCT-142 


l 


RCT-148 




RCT-153 




RCT-177 


— j 


RCT-194 




RCT-198 




RCT-205 




RCT-214 





RCT-246 


2 ■" 1 


RCT-268 


2 


RCT-28 


2 


RCT-280 




RCT-40 




RCT-53 




RCT-59 




RCT-61 




RCT-64 




RCT-66 




RCT-68 




RCT-74 
RCT-94 


1 


Phosphoglycerate kinase 




Retinoid X receptor alpha 




Sarcoplasmic reticulum calcium ATPase 




Serotonin transporter (SERT) 




Superoxide dismutase Cu/Zn 




Thymidylate synthase 




Transitional endoplasmic reticulum ATPase 




Very long-chain acyl-CoA synthetase 




VL30 element 





* Combination category is the number of training/test set gene list occurrences. 
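
By way of illustration only, the combination category described in the footnote above may be obtained by counting, for each gene, the number of training/test set gene lists in which it appears. The sketch below assumes that each of the six training/test sets yields one list of predictive genes and that a "Combo n" set consists of the genes occurring in exactly n of those lists; the function names and the example gene identifiers are hypothetical.

```python
from collections import Counter


def combination_counts(gene_lists):
    """Count the number of training/test set gene lists in which each gene occurs."""
    counts = Counter()
    for genes in gene_lists:
        counts.update(set(genes))  # a gene is counted at most once per list
    return counts


def combo_set(counts, n):
    """Genes occurring in exactly n lists (assumed meaning of 'Combo n')."""
    return sorted(gene for gene, c in counts.items() if c == n)


# Hypothetical example with three lists rather than six.
example_lists = [["gene_a", "gene_b"], ["gene_a", "gene_c"], ["gene_a", "gene_b"]]
example_counts = combination_counts(example_lists)
```

Under these assumptions the combination sets are disjoint, so their sizes sum to the size of the combined list.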
Table 23 Kidney Toxicity Compound-Dose Prediction Values for 6 Hour Data Predictive 
Genes (Combined List and Subsets) 

                        Prediction Measure* 
Gene Set    Number      Accuracy**            False Positive**     False Negative**     Geometric Mean** 
            of Genes 

Combo All   176         0.719 (0.571-0.793)   0.258 (0.12-0.45)    0.442 (0.167-0.8)    0.61 (0.42-0.75) 
Combo 6     15          0.747 (0.567-0.8)     0.217 (0.08-0.48)    0.489 (0.167-1.0)    0.542 (0-0.8) 
Combo 5     16          0.536 (0.33-0.7)      0.473 (0.2-0.76)     0.469 (0.2-0.8)      0.48 (0.4-0.65) 
Combo 4     19          0.731 (0.607-0.875)   0.224 (0.05-0.4)     0.525 (0.2-0.8)      0.584 (0.4-0.74) 
Combo 3     21          0.635 (0.33-0.83)     0.348 (0.04-0.68)    0.514 (0.17-0.8)     0.514 (0.35-0.63) 
Combo 2     38          0.607 (0.35-0.83)     0.358 (0.04-0.68)    0.63 (0.4-1.0)       0.402 (0-0.6) 
Combo 1     67          0.588 (0.42-0.82)     0.406 (0.11-0.64)    0.497 (0.2-0.8)      0.509 (0.39-0.63) 
Table 24 Distribution of Compounds* in Individual Training 
and Test Sets for 72 Hour Kidney Data 



Training and Test Set A 



Training Set A 
Negative** 


Training Set A 
Positive 


Test Set A 
Negative 


Test Set A 
Positive 


AMPB 


CIS 


ANIT 


CHCL3 


AZA 


HYD 


5-FU 


CPHOS 


CAD 


LPS 


APAP 


GAN 


CHLOR 


TET 


BEN 




CLO 




BAP 




CYCA 




BRB 




DEX 




BUS 




DIF 




CCL4 




DOX 




CAR 




ERY 




CLOZ 




EST 




CMC 




ETH 




CHEX 




GEN 




DMN 




MET 




ISON 




PHEN 




KETO 




PUR 




NAL 




TAM 




PBARB 




TET 




PEG 








QUIN 








STRZ 








THEO 





Training and Test Set 1 



Training Set 1 Negative 


Training Set 1 
Positive 


Test Set 1 Negative 


Test Set 1 Positive 


AMPB 


CPHOS 


5-FU 


CHCL3 


ANIT 


GAN 


APAP 


CIS 


AZA 


LPS 


BEN 


HYD 


BAP 


TET 


BRB 




CAD 




BUS 




CAR 




CLOZ 




CCL4 




CMC 




CHEX 




DIF 




CHLOR 




DMN 




CLO 




DOX 




CYCA 




ERY 




DEX 




ETH 
EST 




NAL 




GEN 




PEG 




ISON 




PUR 




KETO 




STRZ 




MET 




TAM 




PBARB 








PHEN 








QUIN 








THEO 









Training and Test Set 2 



Training Set 2 
Negative 


Training Set 2 
Positive 


Test Set 2 
Negative 


Test Set 2 Positive 


AMPB 


CHCL3 


5-FU 


CPHOS 


APAP 


CIS 


ANIT 


LPS 


AZA 


GAN 


BRB 


TET 


BAP 


HYD 


CAD 




BEN 




CHEX 




BUS 




CHLOR 




CAR 




CLOZ 




CCL4 




CMC 




CLO 




DEX 




CYCA 




DMN 




DIF 




GEN 




DOX 




NAL 




ERY 




PUR 




EST 




QUIN 




ETH 




STRZ 




ISON 




TAM 




KETO 




THEO 




MET 








PBARB 








PEG 








PHEN 









Training and Test Set 3 



Training Set 3 
Negative 


Training Set 3 
Positive 


Test Set 3 
Negative 


Test Set 3 Positive 


ANIT 


CHCL3 


5-FU 


CPHOS 


APAP 


CIS 


AMPB 


LPS 


BEN 


GAN 


AZA 


TET 


BUS 


HYD 


BAP 




CAD 




BRB 
CAR 




CCL4 




CHLOR 




CHEX 




CLO 




CYCA 








DIF 




CMC 




DOX 








ERY 




DMN 




GEN 




EST 




ISON 




ETH 




PBARB 




KETO 




PHEN 




MET 




PUR 




NAL 




STRZ 




PEG 








QUIN 








TAM 








THEO 









Training and Test Set 4 



Training Set 4 
Negative 


Training Set 4 
Positive 


Test Set 4 
Negative 


Test Set 4 Positive 


5-FU 


CHCL3 


AMPB 


CPHOS 


APAP 


CIS 


ANIT 


HYD 


BEN 


GAN 


AZA 


LPS 


CAR 


TET 


BAP 




CHEX 




BRB 




CHLOR 




BUS 




CLO 




CAD 




CLOZ 




CCL4 




CMC 




DEX 




CYCA 




ERY 




DIF 




EST 




DMN 




ETH 




DOX 




KETO 




GEN 




PBARB 




ISON 




QUIN 




MET 




TAM 




NAL 




THEO 




PEG 








PHEN 








PUR 








STRZ 
Training and Test Set 5 



Training Set 5 
Negative 


Training Set 5 
Positive 


Test Set 5 
Negative 


Test Set 5 Positive 


AZA 


CPHOS 


5-FU 


CHCL3 


BAP 


GAN 


AMPB 


CIS 


BRB 


HYD 


ANIT 


TET 


BUS 


LPS 


APAP 




CAR 




BEN 




CHEX 




CAD 




CHLOR 




CCL4 




CLO 




CMC 




CLOZ 




DEX 




CYCA 




ERY 




DIF 




EST 




DMN 




ETH 




DOX 




GEN 




KETO 




ISON 




NAL 




MET 




PBARB 




QUIN 




PEG 




THEO 




PHEN 








PUR 








STRZ 








TAM 









* For abbreviations please see Table 1 (Compound, Dose, Abbreviation, etc.) 
** Negative = Compounds that did not elicit histopathology (score = 1) 

Positive = Compounds that did elicit histopathology (score of 2 or greater) 
Table 25 List of Genes, Whose Expression at 72 h Directly Correlates with 
Kidney Tubular Necrosis at 72h, Ranked by Pearson Correlation Coefficient 



Gene 


Correlation 
Coefficient 


Clusterin 


0.6981305 


RCT-274 


0.665856 


Gadd153 


0.6007961 


Multidrug resistant protein-1 


0.5731272 


Alpha-tubulin 


0.5714773 


Dynein light chain 1 


0.5593824 


Multidrug resistant protein-3 


0.5498183 


Beta-tubulin, class I 


0.5419734 


Tissue inhibitor of metalloproteinases-1 


0.5197937 


CD44 metastasis suppressor gene 


0.511474 


Thymosin beta-10 


0.5042843 


Calpactin I heavy chain 


0.4974941 


Alpha-fibrinogen 


0.4904063 


RCT-207 


0.4767162 


RCT-127 


0.4754919 


Uncoupling protein 2 


0.461348 


Beta-actin, sequence 2 


0.4559092 


MHC class I antigen RT1.A1(f) alpha- 
chain 


0.4462703 


IgE binding protein 


0.444906 


Ceruloplasmin 


0.4436448 


c-myc 


0.442725 


RCT-24 


0.4374066 


Insulin-like growth factor binding protein 
1 


0.4345538 


RCT-50 


0.4314294 


Cyclin G 


0.4260349 


RCT-12 


0.419707 


RCT-59 


0.4164921 


Zinc finger protein 


0.4164407 


Alpha-1 microglobulin/bikunin precursor 
(Ambp) 


0.4004037 


Complement component C3 


0.3995206 


RCT-49 


0.3986999 


Liver fatty acid binding protein 


0.3981068 


Monocyte chemotactic protein receptor 
(CCR2) 


0.3974403 


RCT-240 


0.3924163 


RCT-126 


0.3918833 
RCT-241 


0.3856618 




0.3792303 




0.3748433 


Emerin 


0 3741759 


Glyceraldehyde 3-phosphate dehydrogenase 


0.3684493 


Vascular cell adhesion molecule 1 
(VCAM-1) 


0.3678838 


Ribosomal protein L13A 


0.3664144 


Hypoxanthine-guanine 
phosphoribosyltransferase 


0.3659874 


Suppressor of cytokine signaling 3 


0.3630873 


Activating transcription factor 3 


0.3625623 


Major acute phase protein alpha-1 


0.3620322 


Major basic protein 1 


0.3614528 


RCT-258 


0.3607649 


RCT-293 


0.3592598 


RCT-138 


0.3578431 


Alanine aminotransferase 


0.3506821 
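
As a non-limiting illustration of how a ranking of the kind shown in Table 25 may be produced, the following sketch computes the Pearson correlation coefficient between the expression values of each gene and the tubular necrosis score across samples, and sorts the genes in descending order of that coefficient. The data layout, the function names and the gene identifiers are assumptions made for illustration and are not taken from the specification.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # no variation, so the correlation is treated as zero here
    return sxy / sqrt(sxx * syy)


def rank_genes(expression, scores):
    """Rank genes by correlation between expression and necrosis score, highest first.

    expression: dict mapping gene name to a list of expression values per sample.
    scores: list of histopathology scores for the same samples, in the same order.
    """
    ranked = [(gene, pearson(values, scores)) for gene, values in expression.items()]
    return sorted(ranked, key=lambda item: item[1], reverse=True)
```

Genes at the top of such a ranking correlate directly with the score, while genes at the bottom correlate inversely; an inverse ranking by a rank-based coefficient would follow the same pattern with a different correlation function.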
Table 26 List of Genes, Whose Expression at 72 h Inversely Correlates with 
Kidney Tubular Necrosis at 72h, Ranked by Spearman Correlation Coefficient 



Gene 


Correlation 
Coefficient 


RCT-42 


-0.25083 


Membrane bound cytochrome b5 


-0.25275 


RCT-132 


-0.25352 


RCT-99 


-0.25374 


Four repeat ion channel 


-0.25412 


RCT-62 


-0.25524 


RCT-137 


-0.25548 


AT-1 


-0.25881 


UDP-glucuronosyltransferase 2B 


-0.26029 


RCT-214 


-0.26618 


Methylacyl-CoA racemase alpha 


-0.26791 


Cyclin D1 


-0.27006 


Organic anion-transporting polypeptide 1 


-0.27038 


Cystatin C 


-0.27304 


Matrin F/G 


-0.27305 


RCT-181 


-0.27455 


RCT-25 


-0.27625 


RCT-143 


-0.27626 


RCT-93 


-0.28389 


Protein tyrosine phosphatase alpha 


-0.28421 


RCT-79 


-0.28485 


Caspase 2 


-0.28686 


Vascular endothelial growth factor 


-0.28716 


Glutathione S-transferase Ya 


-0.28785 


Senescence marker protein-30 


-0.29192 


RCT-178 


-0.29272 


Organic anion transporter K1 


-0.29329 


RCT-256 


-0.2943 


25-DX 


-0.29444 


RCT-22 


-0.29564 


Sarcoplasmic reticulum calcium ATPase 


-0.2974 


RCT-280 


-0.29749 


RCT-148 


-0.30758 


Arginosuccinate synthetase 1 


-0.30894 


RCT-142 


-0.31028 


RCT-260 


-0.31039 


Apoptosis-regulating basic protein 


-0.31798 


Organic anion transporter 3 


-0.32302 
Ornithine aminotransferase 


-0.32748 


rXCITlU El OUl n alLIIlci 1 wiicuii ^cULwiliaiv 


-0 33449 


^yiocnrome r*fjw x,t\o 




Hemoglobin alpha 1 chain 




Selenoprotein P 


-ft 


L-yiocnrome r**ou z^z j 


-ft IdfiQfi 


rancreatic secretory trypsin inniDiior lype u vro 11-11; 


ft 14717 

-U.JH-/ 1Z 


KC 1 -Jo 


ft 14QR7 

-u.j*tyoz 


Iron-responsive element-binding protein 


-ft 1^77 
•U.JJ / z 


dot in 


ft 1^778 
-U.jOZ/o 


Epidermal growth factor 


ft 1A4R7 


Sodium/glucose cotransporter 1 


ft lA^QA 


D/*"T Oil 

KCi-212 


-U.jOOU** 


Cytochrome c oxidase subunit II 


_ft lf\£7R 


npr On 


_ft 17ftlfi 


Acyl-CoA dehydrogenase, medium chain 


-U.J /JZO 


T>/" ,r P ICS 


_ft 177Q1 




-ft 17007 


Malate dehydrogenase, cytosolic


-ft iR70Vi 


D-dopachrome tautomerase 


ft 1RA07 
-U, JO*fy / 


1 -o / 


ft 1RS7 
-U. Jo J / 


Pancreatic secretory trypsin inhibitor type II (PSTI-II) (alternate clone)


-ft AC\(\r\A 
-U.*fUUU*r 




-0.40144 


RCT-69 


-0.40543 


Thiopurine methyltransferase 


-0.41035


Very long-chain acyl-CoA synthetase 


-0.41248 


Fatty acyl-CoA oxidase 


-0.42391


RCT-287 


-0.4351 


Dimethylarginine dimethylaminohydrolase 


-0.4413 


RCT-182 


-0.44238 


RCT-291 


-0.4606 


3-hydroxyisobutyrate dehydrogenase 


-0.48712 
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Table 27 List of genes whose expression at 72 hours is 
predictive of kidney toxicity at 72 hours 





Gene


Combinations (No. of Occurrences)*


Alanine aminotransferase 


6 


Alpha-tubulin


6 


Beta-actin, sequence 2


6 


Beta-tubulin, class I


6 


Gadd153


6 


Glyceraldehyde 3-phosphate dehydrogenase 


6 


Insulin-like growth factor binding protein 1


6 


Integrin beta-4


6 


Major basic protein 1


6 


MHC class I antigen RT1.A1(f) alpha-chain


6 


Monocyte chemotactic protein receptor (CCR2)


6 


Multidrug resistant protein-3 


6 


RCT-211 


6 


RCT-24 


6 


RCT-240 


6 


RCT-274 


6 


Alpha-fibrinogen 


5 


Calpactin I heavy chain 


5 


CD44 metastasis suppressor gene 


5 


Ceruloplasmin 


5 


c-myc 


5 


Dynein light chain 1 


5 


Emerin 


5 


Hypoxanthine-guanine phosphoribosyltransferase 


5 


IgE binding protein 


5 


Liver fatty acid binding protein 


5 


Major acute phase protein alpha-1


5 


Multidrug resistant protein-1


5 


RCT-12 


5 


RCT-127 


5 


RCT-182 


5 


RCT-293 


5 


RCT-49 


5 


RCT-50 


5 


RCT-59 


5 


Ribosomal protein L13A 


5 


Suppressor of cytokine signaling 3 


5 


Thymosin beta-10


5 



/ 
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Tieeina inKiifiitnr nf rnptAl1oni*Ot£ina«ies- 1 


5 


Uncoupling protein 2


5 


Pancreatic secretory trypsin inhibitor type II (PSTI-II) (alternate clone)


4






4 


Activating transcription factor 3 


4 


Alpha-1 microglobulin/bikunin precursor (Ambp)


4 


Clusterin


4 


Complement component C3


4 


Cyclin dependent kinase 4


4 


Fatty acyl-CoA oxidase


4 


Lraao/o 


4 


Na/K ATPase alpha-1


/I 


[no ten 1 


4 


Pancreatic secretory trypsin inhibitor type II (PSTI-II)


4 






KV^ I - 1 ZO 


4 


KV^I-l jo 


4 


ppr 907 
I\.V_ 1 1 


4 




4 


ppt ofn 

ls\s 1 -ZO / 


4 


PPT Aft 


4 


Qtofrtimirt 

aiainiiun 


4 


Qiinarnvi/iA Hi cm 1 1 fil c*» \Afl 
ijUpcrUAlUc UlalllUlooC lviil 


4 


1 'KfnTnVsrvmoH 1 1 1 i n 
X lUUlllUUlliVUUlJll 


4 


Vascular cell adhesion molecule 1 (VCAM-1)


4 


Zinc finger protein


4 


25-hydroxyvitamin D3 1-alpha-hydroxylase


3 


3-hydroxyisobutyrate dehydrogenase


3 


3-methyladenine DNA glycosylase


3 


Annexin V


3 


Bax (alpha)


3 


Carbonyl reductase


3 




3 


c-jun 


3 


Cyclin G 


3 


Cytochrome P450 2C23 


3 


D-dopachrome tautomerase 


3 


Dimethylarginine dimethylaminohydrolase 


3 


DNA binding protein inhibitor ID2 


3 


Ecto-ATPase 


3 


Epidermal growth factor 


3 


Interleukin-10


3 


Macrophage inflammatory protein-2 alpha 


3 


NADPH cytochrome P450 oxidoreductase 


3 
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RCT-10I 


3 


RCT-109 


3 


RCT-146 


3 


RCT-155 


3 


RCT-192 


3 


RCT-212 


3 


RCT-258 


3 


RCT-287 


3 


RCT-291 


3 


RCT-296 


3 


RCT-87 


3 


Pr^nrnal hn m i n 
r i c^si vcul/uiiuii 


3 


RAD 


3 


Thioredoxin-2 (Trx2)


3 


Very long-chain acyl-CoA synthetase


3 


25-DX 


2 


Activin receptor type II


2 


Aldehyde dehydrogenase, microsomal


2 


Cathepsin L, sequence 2


2 


Cathepsin S


2 


CCR-5 


2 


CXCR4 


2 


villi i/i 


2 


Cyclin dependent kinase 2


2 


Cyclooxygenase 2


2 


Cytochrome c oxidase subunit II


2 


Cytochrome P450 1A1


2 


Diacylglycerol kinase zeta


2 


E-selectin 


2 


Glucose-6-phosphate dehydrogenase


2 


ED-1 


2 


Malate dehydrogenase, cytosolic


2 


Monoamine oxidase B 


2 


Myelin basic protein 


2 


Organic anion transporter K1


2 


RCT-10 


2 


RCT-141 


2 


RCT-145 


2 


RCT-215 


2 


RCT-237 


2 


RCT-271 


2 


RCT-34 


2 


RCT-39 


2 


RCT-6 


2 


RCT-66 


2 








2 




2 


RPT-Q9 


2 


Phosphatidylethanolamine-binding protein


2 


Pfrteto <rl anni n W cv/ntnucp 


2 


Selenoprotein P


2 


Senescence marker protein-30


2


Sodium/glucose cotransporter 1


2


i nioi-speciiic anuOAiaani yu aiux <u xiiier ccu- 
cnnancing lacior d j 


2 


Thiopurine methyltransferase


2 


Tissue factor




TJPT.171 
1 - 1 / 1 




Hemoglobin alpha 1 chain (alternate clone)


: 


3-beta-hydroxysteroid dehydrogenase (HSD3B1)


: 


Uw UUvaVJlllcll pi Vic ill 


~ x 


Acyl-CoA dehydrogenase, medium chain


: 


ADP-ribosylation factor-like protein ARL184


j 




Aldehyde dehydrogenase 2




Aryl hydrocarbon receptor









Pol/*inaimn U 

L^aicincunn-i? 




Calreticulin




Carbonic anhydrase III, sequence 2





Cathepsin L





Cellular nucleic acid binding protein (CNBP)




c-H-ras 


j 


Connexin-32


7 


Cystatin C


7 






Cytochrome P450 2B1/2B2




Cytochrome P450 2C11




DNA topoisomerase I




Ferritin H-chain




Gamma-glutamyl transpeptidase




Glutathione S-transferase mu-2




Glycine methyltransferase 




Heme oxygenase 




Hemoglobin alpha 1 chain 




Hepatocyte nuclear factor 4 




Interferon related developmental regulator IFRD1 
(PC4) 




Interleukin-1 beta 




Interleukin-18 




Iron-responsive element-binding protein 
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Matrix metalloproteinase-1 


1 


Methylacyl-CoA racemase alpha 


I 


Monoamine oxidase A 


I 


Mxl protein 


I 


Na/H antiDorter CArNm) 


I 


N-cadherin 




N-hydroxy-2-acetylaminofluorene sulfotransferase
(ST1C1) 


1 


Organic anion transporter 3 


I 


Organic anion transporting polypeptide 1 


I 


Ornithine aminotransferase 


I 


Osteon on ti n 




RCT-165


j 


RCT-128 




Apoptosis-regulating basic protein 


! 


RCT- 137 


1 


RCT- 143 


I 


RCT- 148 


j 


RCT- 149 




RCT- 161 




RCT- 166 




RCT- 179 


I 


RCT- 180 


I 


RCT-181 




RCT- 193 


I 


RCT- 197 




Vacuole membrane protein 1 




RCT-22


I 


RCT-228


I 


RCT-242 


I 


RCT-244 


I 


RCT-26 




RCT-260 


I 


RCT-264 


I 


RCT-280 


I 


RCT-284 


1 


RCT-288 




RCT-295 




RCT-38 




RCT-45 




RCT-62 




RCT-64 




RCT-99 




Poly(ADP-ribose) polymerase 




Protein tyrosine phosphatase alpha 
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Pyruvate kinase, muscle 




Ribosomal protein S8 




Ribosomal protein S9 




Sarcoplasmic reticulum calcium ATPase 




Thioredoxin-1 (Trxl) 




Tryptophan hydroxylase 




UDP-glucuronosyl transferase 




Urokinase plasminogen activator receptor 




Vascular endothelial growth factor 





* Combination category is the number of training/test set gene list occurrences. 
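
The combination category defined in the footnote above can be illustrated with a short sketch. It assumes each of the six training/test sets yields its own list of predictive genes and simply counts how many of those lists contain a given gene. The function names and the toy gene lists are hypothetical and are not taken from the specification.

```python
from collections import Counter


def combo_categories(gene_lists):
    """Map each gene to the number of gene lists that contain it."""
    counts = Counter()
    for genes in gene_lists:
        # set() ensures a gene is counted at most once per list
        counts.update(set(genes))
    return dict(counts)


def genes_in_category(categories, n):
    """Return the sorted genes that occur on exactly n lists."""
    return sorted(g for g, c in categories.items() if c == n)


# Hypothetical toy lists (not the actual training/test set gene lists)
example_lists = [
    ["Alpha-tubulin", "c-myc", "RCT-24"],
    ["Alpha-tubulin", "c-myc"],
    ["Alpha-tubulin", "Emerin"],
]
categories = combo_categories(example_lists)
```

With six real lists, a gene present on all of them would fall in the "Combo 6" category and a gene present on only one in "Combo 1".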
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Table 28 Kidney Toxicity Compound-Dose Prediction Values for 72 Hour Data 
Predictive Genes (Combined List and Subsets) 



                                     Prediction Measure*
Gene Set    Number      Accuracy**             False Positive**       False Negative**      Geometric Mean**
            of Genes

Combo All   225         0.882 (0.643-0.974)    0.086 (0-0.364)        0.361 (0.167-0.75)    0.747 (0.500-0.913)
Combo 6     16          0.808 (0.607-0.902)    0.166 (0-0.455)        0.444 (0.167-1.0)     0.601 (0-0.869)
Combo 5     27          0.742 (0.429-0.921)    0.228 (0.026-0.591)    0.486 (0.333-0.75)    0.616 (0.452-0.803)
Combo 4     23          0.828 (0.5-0.917)      0.138 (0-0.545)        0.486 (0.25-1.0)      0.607 (0-0.839)
Combo 3     33          0.705 (0.357-0.902)    0.226 (0.027-0.591)    0.722 (0.5-1.0)       0.414 (0-0.649)
Combo 2     41          0.661 (0.357-0.868)    0.288 (0.031-0.591)    0.681 (0.333-1.0)     0.412 (0-0.690)
Combo 1     90          0.783 (0.536-0.941)    0.179 (0.027-0.455)    0.500 (0.167-1.0)     0.572 (0-0.896)



* Prediction measures are given as means and range of values (in parentheses) for six 
training/test sets using 72 hour array data and gene lists. Unit of prediction was the 
animal and the predictive classification was for kidney tubular necrosis observed at 72 
hours after treatment. 



** Standard prediction measures were used as defined in Materials and Methods. As
described in Materials and Methods, in these analyses, cases in which no prediction was
made because the p-value ratio exceeded the cutoff value (generally 0.5) were counted
as incorrect.
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Table 29 Predictive Performance of Various Models 







Model Performance using 6 Sets

                        Testing Sets
Models                  Set 1      Set 6      Set 3      Set 2      Set 5      Set 4      Mean

KNN (Log Trans)         0.92489    0.878164   0.86155    0.850047   0.952774   0.739369   0.867799
Logistic                0.828702   0.60604    0.851969   0.803219   0.74162    0.802773   0.772387
Centroid                0.863092   0.892898   0.61051    0.596941   0.849274   0.762296   0.762502
Nnet (Log Trans)        0.831605   0.83795    0.676123   0.722401   0.703167   0.663883   0.739188
Logistic (Log Trans)    0.826518   0.603062   0.847566   0.753487   0.625389   0.551093   0.701186
Tree                    0.537733   0.879581   0.921401   0.794245   0.544671   0.516398   0.699005
Nnet                    0.769916   0.83395    0.565445   0.714419   0.667083   0.607362   0.693029
Mean                    0.797494   0.790235   0.76208    0.747823   0.726283   0.663311




















Performance Measure = Geometric Mean of the True Positives and True Negatives 




Best Performance in Bold 












Centroid values are averaged over 5 runs
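
The performance measure named in the table notes can be sketched as follows. This is a minimal illustration that assumes "Geometric Mean of the True Positives and True Negatives" means the square root of the true positive rate times the true negative rate, and that a non-call (no prediction) counts as incorrect, as stated in the Table 28 footnote. The function name and the input encoding (True, False, None) are hypothetical.

```python
import math


def geometric_mean_score(actual, predicted):
    """Geometric mean of the true positive rate and true negative rate.

    actual:    list of bool (True = toxicity observed)
    predicted: list of True, False, or None (None = no call, scored as wrong)
    """
    positives = sum(1 for a in actual if a)
    negatives = len(actual) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    # A None prediction matches neither branch, so it is counted as incorrect.
    true_pos = sum(1 for a, p in zip(actual, predicted) if a and p is True)
    true_neg = sum(1 for a, p in zip(actual, predicted) if not a and p is False)
    return math.sqrt((true_pos / positives) * (true_neg / negatives))


# Hypothetical example: 4 toxic and 4 non-toxic animals
actual = [True, True, True, True, False, False, False, False]
predicted = [True, True, True, None, False, False, False, True]
score = geometric_mean_score(actual, predicted)
```

Because the two rates are multiplied, a model that predicts only one class scores zero, which is consistent with the lower bounds of 0 reported for several gene sets in Table 28.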
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Table 30 Logistic Discrimination Coefficients 





Gene                       Absolute Value of Coefficient    Coefficient

PAR interacting protein    3948.7722                         3948.7722
RCT-145                    1756.2178                        -1756.2178
Gadd153                    1502.4772                         1502.4772
Ribosomal protein L13A     1497.8289                        -1497.8289
Alpha tubulin              1060.1632                         1060.1632
Cathepsin L sequence 2      821.1935                          821.1935
RCT-271                     564.5671                         -564.5671
c-myc                       514.0376                         -514.0376
Uncoupling protein 2        483.928                           483.928
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Table 31 Prediction of Kidney Toxicity for Samples External to Database 



Predicting
Gene Set*


Treatment


Animal


Prediction


P-Value
Ratio


Prediction Values**:


No Votes


No P-
Value


Yes
Votes


Yes P-
Value


Combo 6 


Cephaloridine 1500 mg/kg i.p.
24 h 


501 


yes 


0.000 


0 


1 


10 


0 


Combo 6 


Cephaloridine 1500 mg/kg i.p.
24 h 


506 


yes 


0.000 


0 


1 


10 


0 


Combo 6 


Cephaloridine 1500 mg/kg i.p.
24 h


508 


yes 


0.000 


0 


1 


10 


0 


Combo 6


Cisplatin 20 mg/kg i.p. 24 h


609 


yes 


n ono 


9 


I 


o 
o 


o 

V 




Cisplatin 20 mg/kg i.p. 24 h


601 


VAC 

yea 


o ono 


o 


1 
A 


10 


n 




Cisplatin 20 mg/kg i.p. 24 h


604 




0 000 


o 




10 


o 




















Combo 5 


Cephaloridine 1500 mg/kg i.p.
24 h 


501 


yes 


0.001 


4 


1 


6 


0.001 


Combo 5 


Cephaloridine 1500 mg/kg i.p.

24 h 


506 


yes 


0.000 


1 


1 


9 


0 


Combo 5 


Cephaloridine 1500 mg/kg i.p.
24 h


508 


yes 


0.000 


2 


1 


8 


0 


Combo 5


Cisplatin 20 mg/kg i.p. 24 h


609 


UPC 

yea 


0 208 


7 


0 945 




0 197 




Cisplatin 20 mg/kg i.p. 24 h


601 




0 908 


7 


0 94*5 




0 197 


Combo 5


Cisplatin 20 mg/kg i.p. 24 h


604 


yea 


0 001 


4 


1 
1 


6 


0 001 

v.V/V 1 .. 




















Combo 4 


Cephaloridine 1500 mg/kg i.p.
24h 


501 


yes 


0.000 


i 


1 


9 


0 


Combo 4 


Cephaloridine 1500 mg/kg i.p.
24h 


506 


yes 


0.000 


2 


1 


8 


0 


Combo 4 


Cephaloridine 1500 mg/kg i.p.
24 h


508 


yes 


0.000 


0 


1 


10 


0 


Combo 4


Cisplatin 20 mg/kg i.p. 24 h


602 

\J\J £* 


jr to 


0.010 


5 


0.999 


5 


0.01 


Combo 4


Cisplatin 20 mg/kg i.p. 24 h


601 




0 000 


i 


1 

1 


9 


o 


Combo 4


Cisplatin 20 mg/kg i.p. 24 h


604 


VAC 


0 000 


1 
1 


1 
1 


o 


0 




















Combo 3 


Cephaloridine 1500 mg/kg i.p. 
24 h 


501 


yes 


0.001 


4 


1 


6 


0.001 


Combo 3 


Cephaloridine 1500 mg/kg i.p. 
24 h 


506 


yes 


0.208 


7 


0.945 


3 


0.197 


Combo 3 


Cephaloridine 1500 mg/kg i.p. 
24 h 


508 




0.606 


8 


0.803 


2 


0.487 


Combo 3 


Cisplatin 20 mg/kg i.p. 24 h 


602 


yes 


0.208 


7 


0.945 


3 


0.197 


Combo 3 


Cisplatin 20 mg/kg i.p. 24 h 


603 


yes 


0.001 


4 


1 


6 


0.001 


Combo 3 


Cisplatin 20 mg/kg i.p. 24 h 


604 


yes 


0.055 


6 


0.99 


4 


0.055 




















Combo 2 


Cephaloridine 1500 mg/kg i.p. 


501 


yes 


0.000 


3 


1 


7 


0 
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24 h 
















Combo 2 


Cephaloridine 1500 mg/kg i.p.
24 h 


506 


yes 


0.000 


3 


1 


7 


0 


Combo 2 


Cephaloridine 1500 mg/kg i.p.
24 h 


508 


yes 


0.000 


3 


1 


7 


0 


Combo 2


Cisplatin 20 mg/kg i.p. 24 h


VV6 




0010 


5 


0.999 


5 


0.01 


Combo 2


Cisplatin 20 mg/kg i.p. 24 h


\}\JJ 


VAC 


0 000 




1 


7 


o 


Combo 2


Cisplatin 20 mg/kg i.p. 24 h


Dvrr 


yes 


0000 


it 


1 


8 


o 




















Combo 1 


Cephaloridine 1500 mg/kg i.p. 
24 h 


501 


yes 


0.000 


1 


1 


9 


0 


Combo 1 


Cephaloridine 1500 mg/kg i.p. 
24 h 


506 


yes 


0.000 


1 


1 


9 


0 


Combo 1 


Cephaloridine 1500 mg/kg i.p. 
24 h 


508 


yes 


0.000 


3 


1 


7 


0 


Combo 1 


Cisplatin 20 mg/kg i.p. 24 h 


602 


yes 


0.001 


4 


1 


6 


0.001 


Combo 1 


Cisplatin 20 mg/kg i.p. 24 h 


603 


yes 


0.000 


3 


1 


7 


0 


Combo 1 


Cisplatin 20 mg/kg i.p. 24 h 


604 


yes 


0.000 


3 


1 


7 


0 



* All genes used for Combo Gene Lists. 

** Prediction values are output from prediction program. Values include prediction 
(yes=kidney toxicity predicted, no=no kidney toxicity predicted), numbers of yes and no 
votes from 10 nearest neighbors, the p-value for the no and yes votes and the p-value 
ratio for the predicted class over the not predicted class. A p-value ratio cutoff of 0.5 was 
used.
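
The call rule described in the footnote above can be sketched as follows. The sketch takes the yes and no p-values as given inputs (how the prediction program derives them from the 10 nearest-neighbor votes is not reproduced here), and it assumes that the predicted class is the one with the smaller p-value, which is consistent with the rows of Table 31. The function name `call_prediction` is hypothetical.

```python
def call_prediction(p_yes, p_no, cutoff=0.5):
    """Return (prediction, p_value_ratio) from the yes and no p-values.

    prediction is "yes", "no", or None when the p-value ratio of the
    predicted class over the not-predicted class exceeds the cutoff.
    """
    # Assumption: the class with the smaller p-value is the predicted class.
    if p_yes <= p_no:
        predicted, p_pred, p_other = "yes", p_yes, p_no
    else:
        predicted, p_pred, p_other = "no", p_no, p_yes
    ratio = p_pred / p_other if p_other > 0 else float("inf")
    if ratio > cutoff:
        return None, ratio  # no call
    return predicted, ratio


# Values taken from the Combo 3 rows of Table 31
call_506, ratio_506 = call_prediction(p_yes=0.197, p_no=0.945)
call_508, ratio_508 = call_prediction(p_yes=0.487, p_no=0.803)
```

For animal 506 the ratio rounds to 0.208 and a "yes" call is made; for animal 508 the ratio rounds to 0.606, above the 0.5 cutoff, so no prediction is reported, matching the blank prediction cell in that row.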
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Table 33 Kidney Predictive Genes (376 genes) 
Organized by Time Point and Combo Category* 


Gene 


6h 


24h 


72h 


60S ribosomal protein L6 (alternate clone 1) 


Combo 1


Combo 6 


Not Found 


RCT-171 


Not Found 


Not Found 


Combo 1 


Preproalbumin, sequence 2 (alternate clone 1)


Not Found 


Combo 4 


Not Found 


Hemoglobin alpha 1 chain (alternate clone)


Combo 1 


Not Found 


Combo 1 


Pancreatic secretory trypsin inhibitor type II (PSTI-II) 
(alternate clone) 


Not Found 


Combo 3 


Combo 4 


14-3-3 zeta 


Combo 5 


Combo 1 


Combo 4 


RCT-139 


Combo 3 


Not Found 


Not Found 


25-DX 


Not Found 


Not Found 


Combo 2 


25-hydroxyvitamin D3 1-alpha-hydroxylase


Not Found 


Not Found 


Combo 3 


3-beta-hydroxysteroid dehydrogenase (HSD3B1) 


Not Found 


Not Found 


Combo 1 


3-hydroxyisobutyrate dehydrogenase 


Not Found 


Not Found 


Combo 3 


3-methyladenine DNA glycosylase 


Not Found 


Not Found 


Combo 3 


60S ribosomal protein L6


Not Found 


Combo 5 


Combo 1 


Acetylcholine receptor epsilon 


Combo 1 


Not Found 


Not Found 


Activating transcription factor 3 


Not Found 


Not Found 


Combo 4 


Activin receptor type II 


Not Found 


Combo 2 


Combo 2 


Acyl-CoA dehydrogenase, medium chain 


Not Found 


Combo 1 


Combo 1 


ADP-ribosylation factor-like protein ARL184 


Combo 5 


Not Found 


Combo 1 


Adrenodoxin reductase 


Not Found 


Combo 1 


Not Found 


Alanine aminotransferase 


Not Found 


Not Found 


Combo 6 


Alcohol dehydrogenase 1 


Not Found 


Combo 1 


Not Found 


Aldehyde dehydrogenase 1 


Combo 1 


Not Found 


Not Found 


Aldehyde dehydrogenase 2 


Combo 5 


Not Found 


Combo 1


Aldehyde dehydrogenase, microsomal 


Not Found 


Not Found 


Combo 2 


Alpha-1 acid glycoprotein


Combo 1 


Not Found 


Not Found 


Alpha-1 microglobulin/bikunin precursor (Ambp)


Combo 2 


Not Found 


Combo 4 


Alpha-1,2-fucosyltransferase


Combo 4 


Not Found 


Not Found 


Alpha-2-macroglobulin 


Not Found 


Combo 1 


Not Found 


Alpha-fibrinogen 


Combo 1 


Combo 5 


Combo 5 


Alpha-tubulin 


Combo 6 


Combo 6 


Combo 6 


Annexin V 


Not Found 


Combo 3 


Combo 3 


Apolipoprotein CIII


Combo 1 


Not Found 


Not Found 


Aquaporin-3 (AQP3) 


Combo 4 


Not Found 


Not Found 


Argininosuccinate lyase 


Combo 1 


Not Found 


Not Found 


Arginosuccinate synthetase 1 


Not Found 


Combo 1 


Not Found 


Aryl hydrocarbon receptor 


Not Found 


Not Found 


Combo 1 


Aspartoacylase 


Combo 2 


Combo 3 


Combo 1 


ATP-stimulated glucocorticoid-receptor translocation 
promoter (Gyk) 


Combo 1 


Combo 4 


Not Found 


Bax (alpha) 


Not Found 


Not Found 


Combo 3 


Bcl-2 


Combo 3 


Combo 1 


Not Found 


Beta-actin, sequence 2 


Not Found 


Combo 5 


Combo 6 


Beta-tubulin, class I 


Combo 5 


Combo 5 


Combo 6 
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Calhindin-D (W\ 


Combo 1 


Not Found 


Not Found 


Calcineurin-B


Not Found 


Not Found 


Combo 1 


Calnexin


Not Found 


Combo 1 


Not Found 


Calpactin I heavy chain


Combo 3 


Combo 6 


Combo 5 


Calreticulin 


Combo 6 


Combo 3 


Combo 1 


Canalicular multispecific organic anion transporter


Not Found 


Combo 5 


Not Found 


Carbamyl phosphate synthetase I


Combo 1 


Not Found 


Not Found 


Carbonic anhydrase III, sequence 2


Not Found 


Combo 5 


Combo 1 


Carbonyl reductase


Not Found 


Combo 1 


Combo 3 


Casein-alpha


Not Found 


Combo 2 


Not Found 




Not Found 


Not Found 


Combo 3 


LBspase / 


Combo 1


Not Found 


Not Found 


Cathepsin L


Combo 6


Combo 6 


Combo 1 


Cathepsin L, sequence 2


Combo 4


Combo 6 


Combo 2 




Not Found 


Combo 3 


Combo 2 




Not Found 


Not Found 


Combo 2 




Combo 1 


Combo 4 


Combo 5 


CDK102 


Not Found 


Combo 2 


Not Found 


CDK108 


Not Found 


Combo 6 


Not Found 


Cellular nucleic acid binding protein (CNBP)


Not Found 


Combo 2 


Combo 1 


Ceruloplasmin


Not Found 


Combo 4 


Combo 5 




Combo 3 


Not Found 


Not Found 


Cholesterol 7-alpha-hydroxylase (P450 VII)


Combo 1


Not Found 


Not Found 


Cholesterol esterase


Not Found 


Combo 1 


Not Found 


c-H-ras


Combo 6 


Not Found 


Combo 1 


c-jun


Combo 1 


Not Found 


Combo 3 


Clusterin


Not Found 


Combo 6 


Combo 4 


c-myc 


Combo 1


Combo 6 


Combo 5 


OUIUIiy-oUlllUlaUllg lal*LUJ~l 


Combo 2 


Not Found 


Not Found 


Complement component C3


Not Found 


Combo 2 


Combo 4 


Connexin-32 


Combo 3 


Combo 4 


Combo 1 




Not Found 


Not Found 


Combo 2 


Cyclin D1


Not Found 


Not Found 


Combo 2 


Cyclin dependent kinase 2


Not Found 


Not Found 


Combo 2 


Cyclin dependent kinase 4


Combo 1 


Not Found 


Combo 4 


Cyclin E


Combo 6 


Not Found 


Not Found 


Cyclin G


Not Found 


Not Found 


Combo 3 


Cyclooxygenase 2 


Not Found 


Not Found 


Combo 2 


Cystatin C


Not Found 


Not Found 


Combo 1 


Cytochrome c oxidase subunit II 


Not Found 


Not Found 


Combo 2 


Cytochrome c oxidase subunit IV 


Combo 1 


Not Found 


Not Found 


Cytochrome P450 14DM 


Not Found 


Combo 1 


Not Found 


Cytochrome P450 1A1 


Combo 3 


Not Found 


Combo 2 


Cytochrome P450 1B1


Not Found 


Not Found 


Combo 1 


Cytochrome P450 2A3


Not Found 


Combo 1 


Not Found 


Cytochrome P450 2B1/2B2


Not Found 


Not Found 


Combo 1 


Cytochrome P450 2C11 


Combo 1 


Combo 1 


Combo 1 


Cytochrome P450 2C23 


Not Found 


Combo 1 


Combo 3 
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D-dopachrome tautoroerase 


Not Found 


Not Found 


Combo 3 


Decorin


Combo 5 


Not Found 


Not Found 


Defender against cell death-1


Not Found 


Combo 2 


Not Found 


Diacylglycerol kinase zeta 


Not Found 


Not Found 


Combo 2 


Dimethylarginine dimethylaminohydrolase


Not Found 


Combo 3 


Combo 3 


DNA binding protein inhibitor ID2 


Not Found 


Combo 1 


Combo 3 


DNA topoisomerase I 


Combo 1 


Combo 2 


Combo 1 
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* A Combo entry number indicates that the gene was on the predictive list for that time 
point and the number of occurrences of that gene on optimal combined training/test set 
lists. "Not Found" indicates that the gene was not on the optimal combined list for that 
time point. 
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Table 34 RCT genes (ESTs) Predictive for Kidney Tubular 
Necrosis: Best Homology Matches 



Gene Name 




RCT- 10 


Rattus norvegicus methylmalonate semialdehyde dehydrogenase gene 
(Mmsdh) 


RCT- 101 


no significant homology found 


RCT- 102 


Mouse pentylenetetrazol -related mKJNA riL-H yo u ik or co.i; 


RCT- 103 


no significant homology found 


RCT- 108 


no significant homology found 


RCT-109 


Rattus norvegicus nesprin-1 mRNA 


RCT-111


Mus musculus B lymphoid kinase (Blk) 


RCT-12 


no significant homology found 


RCT-126 


Homo sapiens, clone MGC:9483 IMAGE:3919901, mRNA


RCT-127 


no significant homology found 


RCT-128 


Mus musculus angiopoietin-related protein 3 (Angptl3) 


RCT-129 


Mus musculus Nedd4 WW binding protein 4 (N4wbp4-pending), mRNA 


RCT-137 


Mus musculus adult male tongue cDNA 


RCT-138 


Mus musculus DAP10 (Dap10) gene


RCT-139


no significant homology found 


RCT- 14 


Rat brain nicotinic receptor alpha 7 subunit 


RCT- 140 


Mouse 13 days embryo head cDNA, RIKEN full-length enriched library, 
clone:3100001I08 


RCT-141 


Mus musculus proteoglycan 3 (megakaryocyte stimulating factor, 
articular superficial zone protein) (Prg4) 


RCT- 142 


Mus musculus 18 days embryo cDNA, RIKEN full-length enriched 
library, clone:1190008J14


RCT- 143 


Homo sapiens NADH dehydrogenase (ubiquinone) Fe-S protein 8 (23kD) 
(NADH-coenzyme Q reductase) (NDUFS8) 


RCT-144 


Mus musculus, similar to nucleolar protein (KKE/D repeat), clone 
IMAGE:3491448, mRNA, partial cds. 


RCT-145 


Mus musculus 10 day old male pancreas cDNA, RIKEN full-length
enriched library, clone:1810014B19, full insert sequence 


RCT-146 


Mus musculus 8 days embryo cDNA, RIKEN full-length enriched 
library, clone:5730458E20 


RCT- 147 


Rattus norvegicus clone RP31-188L2 


RCT-148 


Mus musculus adult male kidney cDNA, RIKEN full-length enriched 
library, clone:0610010B16 


RCT- 149 


Mouse mRNA fragment for serum amyloid A (SAA) 3 protein 


RCT-151 


Mus musculus, Similar to sphingomyelin phosphodiesterase 1, acid 
lysosomal, clone MGC:11522 IMAGE:3964394
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RCT-152 


Mus musculus, eukaryotic translation elongation factor 1 beta 2, clone 
MGC6763 IMAGE:3600850, mRNA, complete cds. 


RCT-153 


Mouse adult male cerebellum cDNA, RIKEN full-length enriched library, 
clone:1500015I13 


RCT-155 


Mus musculus type XV collagen mRNA 


RCT-158 


Rattus norvegicus cyclin-dependent kinase inhibitor 1B


RCT-161 


Mus musculus adult male spleen cDNA, RIKEN full-length enriched 
library, clone:0910001D19


RCT-162 


Mus musculus, clone IMAGE:3501507 


RCT-164 


Mus musculus adult male testis cDNA, RIKEN full-length enriched 
library, clone:4932443D16 


RCT-165 


Mus musculus adiponutrin (Adpn-pending), mRNA 


RCT-166 


Mus musculus, Similar to glutathione S-transferase theta 1, clone 
MGC:6769 IMAGE:3601446 


RCT-171 


no significant homology found 


RCT-177 


Mus musculus, Similar to peroxisomal delta3, delta2-enoyl-Coenzyme A 
isomerase, clone MGC:5644 IMAGE:3591615 


RCT-179 


Rat nucleolar protein B23.2 mRNA 


RCT-18 


no significant homology found 


RCT-180 


Mus musculus B-cell receptor-associated protein 37 (Bcap37)


RCT-181 


Mus musculus adult male testis cDNA 


RCT-182 


Rattus norvegicus gib mRNA for diacetyl/L-xylulose reductase 


RCT-185 


no significant homology found 


RCT-192 


Mus musculus 18 days embryo cDNA, RIKEN full-length enriched 
library, clone:1110033J19


RCT-193 


no significant homology found 


RCT-194 


Mus musculus ectodermal-neural cortex 1 (Enc1)


RCT-196 


Homologous to Mus musculus 12 days embryo head cDNA, RIKEN
full-length enriched library, clone:3010001M15


RCT-197 


Rattus norvegicus Protein kinase, interferon-inducible double stranded 
RNA dependent (Prkr), mRNA 


RCT-198 


Mus musculus adult male kidney cDNA 


Di^T one 


no significant homology found 


RCT-206 


Homo sapiens, clone IMAGE:3867552 


RCT-207 


Mus musculus Ran binding protein 5 mRNA, partial cds 


RCT-211 


Mus musculus adult male kidney cDNA, RIKEN full-length enriched 
library, clone:0610009C22 


RCT-212 


Mus musculus nuclear localization signal protein absent in velo-cardio- 
facial patients (Nlvcf) 
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RCT-214 


Mus musculus putative NAD(P)H steroid dehydrogenase mRNA 


RCT-215 


Mus musculus RAB/Rip protein mRNA 


RCT-22 


Mus musculus, clone MGC:19042 IMAGE:4188988, mRNA


RCT-220 


no significant homology found 


RCT-221 


no significant homology found 


RCT-228 


no significant homology found 


RCT-237 


M. musculus mRNA for low density lipoprotein receptor


RCT-24 


Mus musculus, tubulin alpha 8, clone MGC:28850 IMAGE:4507364,
mRNA


RCT-240 


Mus musculus, clone MGC:7041 


RCT-241 


Mus musculus oncostatin receptor (Osmr), mRNA 


RCT-242 


Rattus norvegicus B-cell translocation gene 2, anti-proliferative (Btg2)


RCT-244 


Mus musculus RIKEN cDNA 2810408B13 gene


RCT-245 


no significant homology found 


RCT-246 


no significant homology found 


RCT-251 


no significant homology found


RCT-252 




RCT-256 


Mus musculus, Similar to betaine-homocysteine methyltransferase 2, 
clone MGC:19186 IMAGE:4235455


RCT-258 


Mus musculus, clone MGC:6139 IMAGE:3487295, mRNA 


RCT-260 


Mus musculus adult male hippocampus cDNA, RIKEN full-length 
enriched library, clone:2900024P20 


RCT-264 


Mus musculus sodium-sulfate cotransporter (Nasi) gene 


RCT-268 


Mouse adult male liver cDNA, RIKEN full-length enriched library,
clone:1300017J02


RCT-271 


Homologous to Mus musculus, clone MGC:27581 IMAGE:4489072,
mRNA 


RCT-274 


Rattus norvegicus Clusterin (Clu) 


RCT-276 


Homo sapiens KIAA1224 protein 


RCT-277 


no significant homology found 


RCT-279 


no significant homology found 


RCT-28 


no significant homology found 


RCT-280 


Mus musculus carbohydrate (keratan sulfate Gal-6) sulfotransferase 1 
(Chst1)


RCT-281


Mus musculus, Similar to TNF-induced protein, clone MGC:11714




Homo sapiens complement component Clq receptor (C1QR), mRNA 


RCT-287 


Mus musculus adult male kidney cDNA clone:0610010I20 


RCT-288 


no significant homology found 


RCT-291 


no significant homology found 


RCT-292 


Rattus norvegicus 2'5' oligoadenylate synthetase-2 


RCT-293 


Mus musculus 18 days embryo cDNA, RIKEN full-length enriched 
library, clone:1110021C22


RCT-296 


Mus musculus corticosteroid binding globulin (Cbg) 
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RCT-31 


Mouse 10, 11 days embryo cDNA, RIKEN full-length enriched library,
clone:2810437P06


DpT OA 


no significant homology found


PPT 
M\K* i -JO 


no significant homology found


RCT-38 


Mus musculus betaine-homocysteine methyltransferase 2 (Bhmt2),
mRNA


PPT ^0 


no significant homology found 


PPT_/lfl 


Rattus norvegicus Cathepsin C (dipeptidyl peptidase I) (Ctsc) 




ivlus muscuius oi/\i jd ^otaoD; 




no significant homology found 


RCT-45 


ivlus muscuius rNeua*f-Dinaing Drain specinc proiein d-cain nusiNA, 
partial cas 




No match with score above 200


PPT <in 


Mus musculus fibroblast growth factor regulated protein 2


RCT-53 


no significant homology found 


PPT 


no significant homology found 


PPT £ 


supr=serum deprivation response [mice, INIHJI j ceils, mKiNA, zvuv ntj 


PPT £A 

1 -OU 


Mouse, Similar to tyrosyl-tRNA synthetase, clone MGC:19350


npT < 1 

KC1-01 


no significant homology found 


RCT-62


no significant homology found 


RCT-64


no significant homology found 


RCT-66


M. musculus mRNA for low density lipoprotein receptor


PPT /CQ * 

Kl^ 1 -Oo . 


Rattus norvegicus nucleosome assembly protein mRNA 


RCT-69 


It Alio miic/'iiliiP DIVCW aT\MA HA 1 Anil! IO nona /,1 AM a XACU^'^^Af^X 

mus muscuius, KiisJbiN cuina uoiuuj^uiy gene, cione MLrt-..ZD4oo 

TKA A PIP»/M <Q">Q^ 
liVL/\.VXC. < t*f JOL7U 




no significant homology found




no significant homology found


RPT.76 


no significant homology found


PPT-R 


Messenger RNA for rat preproalbumin


RCT-80 


no significant homology found


RCT-83 


no significant homology found 


RCT-84 


no significant homology found 


RCT-87 


Mus musculus adult male tongue cDNA


RCT-88 


no significant homology found 


RCT-89 


no significant homology found 


RCT-91 


no significant homology found 


RCT-92 


no significant homology found 


RCT-94 


Rattus norvegicus Glutamate receptor, metabotropic 5 (Grm5) 


RCT-99 


no significant homology found 



* Homologies are given from BLAST searches using the Phase 1 RCT sequence as the 
query sequence and GenBank NR database as the target sequence database. The best 
BLAST homology sequence observed is given. In general, no significant homology 
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indicates that no BLAST match was observed with a BIT score >100, BLAST searches 
in this category were conducted as recently as February, 2002. 
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Table 35 Fifty-three Genes that are Predictive at ail Three Time Points 



Gene                                                      6h        24h       72h
Alpha-tubulin                                             Combo 6   Combo 6   Combo 6
[illegible]                                               Combo 2   Combo 3   Combo 1
Beta-tubulin, class I                                     Combo 5   Combo 5   Combo 6
Calpactin I heavy chain                                   Combo 3   Combo 6   Combo 5
Calreticulin                                              Combo 6   Combo 3   Combo 1
Cathepsin L                                               Combo 6   Combo 6   Combo 1
Cathepsin L, sequence 2                                   Combo 4   Combo 6   Combo 2
CD44 metastasis suppressor gene                           Combo 1   Combo 4   Combo 5
c-myc                                                     Combo 1   Combo 6   Combo 5
[illegible]                                               Combo 3   Combo 4   Combo 1
Cytochrome P450 2C11                                      Combo 1   Combo 1   Combo 1
DNA topoisomerase I                                       Combo 1   Combo 2   Combo 1
Dynein light chain 1                                      Combo 1   Combo 6   Combo 5
Ecto-ATPase                                               Combo 3   Combo 3   Combo 3
Epidermal growth factor                                   Combo 5   Combo 4   Combo 3
Ferritin H-chain                                          Combo 2   Combo 4   Combo 1
Gamma-glutamyl transpeptidase                             Combo 5   Combo 1   Combo 1
[illegible]                                               Combo 3   Combo 1   Combo 1
Hypoxanthine-guanine phosphoribosyltransferase            Combo 4   Combo 4   Combo 5
IgE binding protein                                       Combo 1   Combo 5   Combo 5
Insulin-like growth factor binding protein 1              Combo 6   Combo 6   Combo 6
Interleukin-1 beta                                        Combo 4   Combo 4   Combo 1
[illegible]                                               Combo 2   Combo 2   Combo 2
Matrix metalloproteinase-1                                Combo 1   Combo 4   Combo 1
Methylacyl-CoA racemase alpha                             Combo 1   Combo 3   Combo 1
MHC class I antigen RT1.A1(f) alpha-chain                 Combo 2   Combo 5   Combo 6
Multidrug resistant protein-3                             Combo 6   Combo 5   Combo 6
Na/K ATPase alpha-1                                       Combo 5   Combo 1   Combo 4
[illegible]                                               Combo 3   Combo 1   Combo 1
N-hydroxy-2-acetylaminofluorene sulfotransferase (ST1C1)  Combo 3   Combo 2   Combo 1
Ornithine aminotransferase                                Combo 1   Combo 2   Combo 1
RCT-109                                                   Combo 4   Combo 6   Combo 3
RCT-127                                                   Combo 2   Combo 2   Combo 5
Apoptosis-regulating basic protein                        Combo 2   Combo 1   Combo 1
RCT-166                                                   Combo 2   Combo 1   Combo 1
RCT-179                                                   Combo 2   Combo 5   Combo 1
RCT-180                                                   Combo 2   Combo 4   Combo 1
RCT-182                                                   Combo 3   Combo 5   Combo 5
RCT-211                                                   Combo 2   Combo 5   Combo 6
RCT-24                                                    Combo 4   Combo 6   Combo 6
RCT-240                                                   Combo 3   Combo 4   Combo 6
RCT-280                                                   Combo 1   Combo 2   Combo 1
RCT-49                                                    Combo 4   Combo 5   Combo 5
RCT-50                                                    Combo 5   Combo 5   Combo 5
RCT-68                                                    Combo 1   Combo 6   Combo 4
Ribosomal protein L13A                                    Combo 5   Combo 6   Combo 5
Sarcoplasmic reticulum calcium ATPase                     Combo 1   Combo 1   Combo 1
Stathmin                                                  Combo 3   Combo 2   Combo 4
Superoxide dismutase Mn                                   Combo 5   Combo 4   Combo 4
Thymosin beta-10                                          Combo 5   Combo 5   Combo 5
Tissue inhibitor of metalloproteinases-1                  Combo 2   Combo 6   Combo 5
Uncoupling protein 2                                      Combo 4   Combo 6   Combo 5
Zinc finger protein                                       Combo 5   Combo 5   Combo 4

Table 36 Twenty-three Genes that are the most predictive across the time points



Gene                                            6h        24h       72h
Alpha-tubulin                                   Combo 6   Combo 6   Combo 6
Beta-tubulin, class I                           Combo 5   Combo 5   Combo 6
Cathepsin L                                     Combo 6   Combo 6   Combo 1
Cathepsin L, sequence 2                         Combo 4   Combo 6   Combo 2
c-myc                                           Combo 1   Combo 6   Combo 5
Epidermal growth factor                         Combo 5   Combo 4   Combo 3
Hypoxanthine-guanine phosphoribosyltransferase  Combo 4   Combo 4   Combo 5
IgE binding protein                             Combo 1   Combo 5   Combo 5
Insulin-like growth factor binding protein 1    Combo 6   Combo 6   Combo 6
Interleukin-1 beta                              Combo 4   Combo 4   Combo 1
Multidrug resistant protein-3                   Combo 6   Combo 5   Combo 6
RCT-211                                         Combo 2   Combo 5   Combo 6
RCT-24                                          Combo 4   Combo 6   Combo 6
RCT-240                                         Combo 3   Combo 4   Combo 6
RCT-49                                          Combo 4   Combo 5   Combo 5
RCT-50                                          Combo 5   Combo 5   Combo 5
RCT-68                                          Combo 1   Combo 6   Combo 4
Ribosomal protein L13A                          Combo 5   Combo 6   Combo 5
Superoxide dismutase Mn                         Combo 5   Combo 4   Combo 4
Thymosin beta-10                                Combo 5   Combo 5   Combo 5
Tissue inhibitor of metalloproteinases-1        Combo 2   Combo 6   Combo 5
Uncoupling protein 2                            Combo 4   Combo 6   Combo 5
Zinc finger protein                             Combo 5   Combo 5   Combo 4
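
As a rough illustration of how a list of genes that are predictive at every time point could be assembled, the following sketch intersects per-time-point gene lists. It assumes that each time point is held as a mapping from gene name to a single Combo list number, as the tables above present it. The function name, the data layout, and the placeholder entry "hypothetical-gene-X" are illustrative assumptions and are not taken from the applicants' software; the c-myc and Stathmin values are those given in Table 35.

```python
from typing import Dict, List

# time point -> {gene name: Combo list number}; this layout is an assumption
ComboTable = Dict[str, Dict[str, int]]


def genes_at_all_time_points(table: ComboTable) -> List[str]:
    """Return the genes that appear in a Combo list at every time point."""
    per_time = [set(genes) for genes in table.values()]
    if not per_time:
        return []
    return sorted(set.intersection(*per_time))


# Small example: two genes from Table 35 plus one hypothetical 6h-only gene.
EXAMPLE: ComboTable = {
    "6h": {"c-myc": 1, "Stathmin": 3, "hypothetical-gene-X": 2},
    "24h": {"c-myc": 6, "Stathmin": 2},
    "72h": {"c-myc": 5, "Stathmin": 4},
}

if __name__ == "__main__":
    print(genes_at_all_time_points(EXAMPLE))
```

Only genes present at 6h, 24h and 72h survive the intersection, so the hypothetical 6h-only entry is dropped.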
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Table 37 Kidney Toxicity Predictive Genes Whose Protein Products Are Known to be 

Secreted 

Ceruloplasmin 

Colony-stimulating factor-1

Complement component C3 

Cystatin C 

Epidermal growth factor 

Ferritin H-chain 

Fibrinogen gamma chain 

Interleukin-1 beta 

Interleukin-10 

Interleukin-18

Keratinocyte growth factor

Macrophage inflammatory protein-1 alpha

Macrophage inflammatory protein-2 alpha

Major acute phase protein alpha-1

Mullerian inhibiting substance

NGF-inducible antiproliferative putative secreted protein (PC3)

Pancreatic secretory trypsin inhibitor type II (PSTI-II)

T-cell cyclophilin

Thioredoxin-1 (Trx1)

Tissue factor 

Tissue inhibitor of metalloproteinases-1 

Transferrin 

Vascular endothelial growth factor 



Table 42 Summary Output of Predictive Computer Software Product 


Sample          Slide Number   Tissue       Dose       Time   Prediction                Certitude
paraquat        16477          Rat Kidney   25 mg/kg   24h    Kidney Tubular Necrosis   0.472
paraquat        16478          Rat Kidney   25 mg/kg   24h    Negative                  0.999
paraquat        16479          Rat Kidney   25 mg/kg   24h    Kidney Tubular Necrosis   0.796
phenobarbital   11494          Rat Kidney   80 mg/kg   24h    Negative                  0.999
phenobarbital   11495          Rat Kidney   80 mg/kg   24h    Negative                  0.999
phenobarbital   11496          Rat Kidney   80 mg/kg   24h    Negative                  0.999



Table 43 Detailed Output of Predictive Computer Software Product 



Sample paraquat 16477 RatKidney 25mg/kg 24h 503r#3132

Predictagen            Performance   Kidney Tubular Necrosis   Negative
24hKidneyCombo1.txt    1.000                                   0.752
24hKidneyCombo2.txt    1.000
24hKidneyCombo3.txt    1.000                                   0.752
24hKidneyCombo4.txt    1.000         0.584
24hKidneyCombo5.txt    1.000         0.997
24hKidneyCombo6.txt    1.000         0.977

Prediction: Kidney Tubular Necrosis with certitude 0.472


Sample paraquat 16478 RatKidney 25mg/kg 24h 503r#3133

Predictagen            Performance   Kidney Tubular Necrosis   Negative
24hKidneyCombo1.txt    1.000                                   0.752
24hKidneyCombo2.txt    1.000                                   0.752
24hKidneyCombo3.txt    1.000                                   0.752
24hKidneyCombo4.txt    1.000                                   0.752
24hKidneyCombo5.txt    1.000                                   0.752
24hKidneyCombo6.txt    1.000                                   0.752

Prediction: Negative with certitude 0.999


Sample paraquat 16479 RatKidney 25mg/kg 24h 503r#3134

Predictagen            Performance   Kidney Tubular Necrosis   Negative
24hKidneyCombo1.txt    1.000                                   0.752
24hKidneyCombo2.txt    1.000
24hKidneyCombo3.txt    1.000
24hKidneyCombo4.txt    1.000         0.882
24hKidneyCombo5.txt    1.000         0.997
24hKidneyCombo6.txt    1.000         0.999

Prediction: Kidney Tubular Necrosis with certitude 0.796


Sample phenobarbital 11494 RatKidney 80mg/kg 24h H375#2634

Predictagen            Performance   Kidney Tubular Necrosis   Negative
24hKidneyCombo1.txt    1.000                                   0.752
24hKidneyCombo2.txt    1.000                                   0.752
24hKidneyCombo3.txt    1.000                                   0.752
24hKidneyCombo4.txt    1.000                                   0.752
24hKidneyCombo5.txt    1.000                                   0.752
24hKidneyCombo6.txt    1.000                                   0.752

Prediction: Negative with certitude 0.999


Sample phenobarbital 11495 RatKidney 80mg/kg 24h H375#2635

Predictagen            Performance   Kidney Tubular Necrosis   Negative
24hKidneyCombo1.txt    1.000                                   0.752
24hKidneyCombo2.txt    1.000                                   0.752
24hKidneyCombo3.txt    1.000                                   0.752
24hKidneyCombo4.txt    1.000                                   0.752
24hKidneyCombo5.txt    1.000                                   0.752
24hKidneyCombo6.txt    1.000                                   0.752

Prediction: Negative with certitude 0.999


Sample phenobarbital 11496 RatKidney 80mg/kg 24h H375#2636

Predictagen            Performance   Kidney Tubular Necrosis   Negative
24hKidneyCombo1.txt    1.000                                   0.752
24hKidneyCombo2.txt    1.000                                   0.752
24hKidneyCombo3.txt    1.000                                   0.752
24hKidneyCombo4.txt    1.000                                   0.752
24hKidneyCombo5.txt    1.000                                   0.752
24hKidneyCombo6.txt    1.000                                   0.752

Prediction: Negative with certitude 0.999
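
The final line of each sample block in the detailed output has the form "Prediction: <call> with certitude <score>". The sketch below shows one way such a line could be read back into a structured record, for example to rebuild the summary table from the detailed output. The regular expression, the names Prediction, parse_prediction and is_toxic_call, and the treatment of "Negative" as the only non-toxic label are assumptions made for illustration; the source does not describe the file format beyond the examples shown.

```python
import re
from typing import NamedTuple, Optional


class Prediction(NamedTuple):
    label: str        # e.g. "Kidney Tubular Necrosis" or "Negative"
    certitude: float  # score printed after "with certitude"


# Pattern inferred from the example lines in the detailed output (an assumption).
_PREDICTION_RE = re.compile(
    r"^Prediction:\s*(?P<label>.+?)\s+with certitude\s+(?P<score>\d*\.\d+|\d+)\s*$"
)


def parse_prediction(line: str) -> Optional[Prediction]:
    """Parse one 'Prediction: ... with certitude ...' line, or return None."""
    match = _PREDICTION_RE.match(line.strip())
    if match is None:
        return None
    return Prediction(match.group("label"), float(match.group("score")))


def is_toxic_call(prediction: Prediction, negative_label: str = "Negative") -> bool:
    """Treat every label other than the negative label as a toxicity call."""
    return prediction.label != negative_label


if __name__ == "__main__":
    print(parse_prediction("Prediction: Kidney Tubular Necrosis with certitude 0.472"))
```

Lines that do not match the pattern, such as the per-list rows, return None and can simply be skipped.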





Table 44. Protein Marker Candidate Identification 



Gene Name                                  Mean Overall     Codes for   Avg Neg   Avg Pos   Secreted
                                           Correct Calls*   Protein     FI**      FI**
Phase-1 RCT-241                            79.9%            yes?        -0.02     0.85
Cathepsin L, sequence 2                    76.7%            yes         0.08      1.19
Phase-1 RCT-145                            76.2%            yes?        -0.01     0.41
Cathepsin L                                76.0%            yes         0.10      1.40
60S ribosomal protein L6                   75.6%            yes         -0.06     0.75
Clusterin                                  75.3%            yes         -0.02     0.48
Osteopontin                                75.3%            yes         -0.08     0.04
Dynein light chain 1                       74.6%                        -0.06     0.23
Tissue inhibitor of metalloproteinases-1   74.0%            yes         0.08      2.98
Uncoupling protein 2                       73.7%            yes         -0.07     0.95
Ribosomal protein S9                       72.9%            yes         -0.03     0.57
Phase-1 RCT-258                            72.5%            yes?        -0.02     0.30
(Ribosomal protein L6)                     71.4%                        0.00      0.58



Gadd153



70.8% 



yes 
yes 



0.08 



0.85 



Proliferating cell nuclear antigen gene | 70.5% 



Gadd45 



-0.06 



0.12 



69.9% 



yes 



0.03 



0.45 



Phase-1 RCT-274



Phase-1 RCT-109



69.8% 



69.8% 



0.03 



0.78 



0.00 



0.46 



Thymosin beta-10



67.8% 



yes 



-0.02 



0.49 



c-myc



67.7% 



yes 



0.14 



0.59 



Phase-1 RCT-158 



Insulin-like growth factor binding 
protein 1 



67.6% 



yes? 



0.01 



0.27 



Phase-1 RCT-179 



Multidrug resistant protein-3 



PAR interacting protein 



67.4% 



yes 



0.11 



1.87 



67.2% 



66.8% 



yes?



0.06 



yes 



0.62 



yes 



0.03 



0.02 



66.4% 



yes 



0.01 



0.37 



Phase-1 RCT-198 



Beta-actin, sequence 2 



66.1% 



0.05 



Phase-1 RCT-24 



65.8% 



65.4% 



yes 



0.01 



0.12 



0.28 



0.05 



0.03 



Alpha-tubulin 



Phase-1 RCT-152 



65.2% 



Phase-1 RCT-60 



64.9% 



yes 



0.00 



0.05 



64.7% 



yes? 



-0.05 



0.73 



0.05 



0.23 



Phase-1 RCT-68 



Keratinocyte growth factor 



64.5% 



0.06 



0.32 



63.2% 



yes 



0.07 



0.99 



Calpactin I heavy chain 



Alpha-fibrinogen 



62.5% 



62.2% 



yes 



0.06 



0.62 



yes 



0.09 



2.29 



Phase-1 RCT-49 



61.2% 



0.02 



0.37 



Phase-1 RCT-199 



IgE binding protein 



60.8% 



yes? 



60.0% 



yes 



-0.03 
-0.07 



0.08 



0.75 



* Percent accuracy [illegible] sets for individual gene predictive [illegible]

** Fold induction relative to 0 = no induction for expression in kidney
treated with nontoxic treatments (Neg FI) or treatments producing kidney toxicity
(Pos FI).
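
The two fold-induction columns of Table 44 can be illustrated with a short sketch that averages one gene's induction values separately over nontoxic and toxic treatments. It assumes the values are already on a scale where 0 means no induction (for example a log ratio), which is how the table footnote reads; the function name and the input layout are hypothetical, and the numbers in the usage example are made up for illustration only.

```python
from statistics import mean
from typing import Sequence, Tuple


def average_fold_induction(
    values: Sequence[float], is_toxic: Sequence[bool]
) -> Tuple[float, float]:
    """Return (Avg Neg FI, Avg Pos FI) for one gene.

    values   -- induction of the gene in each treated sample (0 = no induction)
    is_toxic -- True where the treatment produced kidney toxicity
    """
    if len(values) != len(is_toxic):
        raise ValueError("values and is_toxic must have the same length")
    negative = [v for v, toxic in zip(values, is_toxic) if not toxic]
    positive = [v for v, toxic in zip(values, is_toxic) if toxic]
    if not negative or not positive:
        raise ValueError("need at least one nontoxic and one toxic sample")
    return mean(negative), mean(positive)


if __name__ == "__main__":
    # Made-up values: two nontoxic and two toxic treatments.
    print(average_fold_induction([0.0, 0.5, 1.0, 2.0], [False, False, True, True]))
```

A gene with a near-zero negative average and a clearly positive average for toxic treatments, as in the upper rows of Table 44, is the pattern this calculation is meant to expose.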






WO 03/100030 



PCT/US03/06196 






What is claimed is: 

[c1] A method of predicting the kidney toxicity in an individual to an agent,
comprising the steps of: 

obtaining a biological sample from an individual treated with the agent; 

measuring the expression of one or more kidney toxicity predictive genes 
in the sample, wherein the genes are selected from the group consisting 
of the genes corresponding to the partial gene sequences in Table 32, 
thereby generating a test expression profile; and 

using the test expression profile with a set of reference expression profiles 
in a Predictive Model to determine whether the agent will induce kidney 
toxicity in the individual. 

[c2] The method according to claim 1, wherein the expression of the kidney
toxicity predictive gene is measured at the RNA level.

[c3] The method according to claim 1, wherein the expression of the kidney
toxicity predictive gene is measured at the protein level.

[c4] The method according to claim 2, wherein the genes corresponding to the 
partial gene sequences are members of 24 hour Combo All. 

[c5] The method according to claim 2, wherein the genes corresponding to the 
partial gene sequences are members of 24 hour Combo 6. 

[c6] The method according to claim 2, wherein the genes corresponding to the
partial gene sequences are members of 24 hour Combo 5.

[c7] The method according to claim 2, wherein the genes corresponding to the
partial gene sequences are members of 24 hour Combo 4.

[c8] The method according to claim 2, wherein the genes corresponding to the
partial gene sequences are members of 24 hour Combo 3.

[c9] The method according to claim 2, wherein the genes corresponding to the
partial gene sequences are members of 24 hour Combo 2.

[c10] The method according to claim 2, wherein the genes corresponding to the
partial gene sequences are members of 24 hour Combo 1.

[c11] The method according to claim 3, wherein the genes corresponding to the
partial gene sequences are members of 24 hour Combo All.

[c12] The method according to claim 3, wherein the genes corresponding to the
partial gene sequences are members of 24 hour Combo 6.

[c13] The method according to claim 3, wherein the genes corresponding to the
partial gene sequences are members of 24 hour Combo 5.

[c14] The method according to any of the preceding claims 1-13, wherein the 
expression of at least one gene is measured. 

[c15] The method according to any of the preceding claims 1-13, wherein the 
expression of at least five genes is measured. 

[c16] The method according to any of the preceding claims 1-13, wherein the 
expression of at least ten genes is measured. 

[c17] The method according to any of the preceding claims 1-13, wherein the 
expression of at least fifteen genes is measured. 

[c18] The method according to claim 2, wherein the genes corresponding to the
partial gene sequences are members of 6 hour Combo All.

[c19] The method according to claim 2, wherein the genes corresponding to the
partial gene sequences are members of 6 hour Combo 6.

[c20] The method according to claim 2, wherein the genes corresponding to the
partial gene sequences are members of 6 hour Combo 4.

[c21] The method according to claim 3, wherein the genes corresponding to the
partial gene sequences are members of 6 hour Combo All.

[c22] The method according to claim 3, wherein the genes corresponding to the
partial gene sequences are members of 6 hour Combo 6.

[c23] The method according to claim 3, wherein the genes corresponding to the
partial gene sequences are members of 6 hour Combo 4.

[c24] The method according to any of the preceding claims 18-23, wherein the
expression of at least one gene is measured.

[c25] The method according to any of the preceding claims 18-23, wherein the
expression of at least five genes is measured.

[c26] The method according to any of the preceding claims 18-23, wherein the
expression of at least ten genes is measured.

[c27] The method according to any of the preceding claims 18-23, wherein the
expression of at least fifteen genes is measured.

[c28] The method according to claim 2, wherein the genes corresponding to the 
partial gene sequences are members of 72 hour Combo All. 

[c29] The method according to claim 2, wherein the genes corresponding to the 
partial gene sequences are members of 72 hour Combo 6. 

[c30] The method according to claim 2, wherein the genes corresponding to the
partial gene sequences are members of 72 hour Combo 5.

[c31] The method according to claim 2, wherein the genes corresponding to the
partial gene sequences are members of 72 hour Combo 4.

[c32] The method according to claim 2, wherein the genes corresponding to the
partial gene sequences are members of 72 hour Combo 3.

[c33] The method according to claim 2, wherein the genes corresponding to the
partial gene sequences are members of 72 hour Combo 1.

[c34] The method according to claim 3, wherein the genes corresponding to the
partial gene sequences are members of 72 hour Combo All.

[c35] The method according to claim 3, wherein the genes corresponding to the
partial gene sequences are members of 72 hour Combo 6.

[c36] The method according to claim 3, wherein the genes corresponding to the
partial gene sequences are members of 72 hour Combo 4.

[c37] The method according to any of the preceding claims 28-36, wherein at 
least one gene is used. 

[c38] The method according to any of the preceding claims 28-36, wherein at 
least five genes are used. 

[c39] The method according to any of the preceding claims 28-36, wherein at 
least ten genes are used. 

[c40] The method according to any of the preceding claims 28-36, wherein at 
least fifteen genes are used. 

[c41] The method according to any one of claims 1-13, 18-23, or 28-36, wherein 
the partial gene sequences correspond to rat genes. 

[c42] The method according to any one of claims 1-13, 18-23, or 28-36, wherein
the partial gene sequences correspond to dog genes.

[c43] The method according to any one of claims 1-13, 18-23, or 28-36, wherein
the partial gene sequences correspond to non-human primate genes.

[c44] The method according to any one of claims 1-13, 18-23, or 28-36, wherein
the partial gene sequences correspond to human genes.

[c45] The method according to claim 41, wherein the agent is administered at
different dose levels to determine the presence or absence of a no- 
observable effect level. 

[c46] The method according to claim 42, wherein the agent is administered at 
different dose levels to determine the presence or absence of a no- 
observable effect level. 

[c47] The method according to claim 43, wherein the agent is administered at 
different dose levels to determine the presence or absence of a no- 
observable effect level. 

[c48] The method according to claim 44, wherein the agent is administered at 
different dose levels to determine the presence or absence of a no- 
observable effect level. 
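
The dose-level claims above can be pictured with a small sketch that reads a no-observable effect level off a series of per-dose predictions. It assumes each dose has already been reduced to a single toxic or not-toxic call and takes the level to be the highest dose below the first dose predicted toxic; the function name and that definition are illustrative assumptions, since the claims do not spell out the calculation.

```python
from typing import Mapping, Optional


def no_observable_effect_level(toxic_by_dose: Mapping[float, bool]) -> Optional[float]:
    """Return the highest dose below the first dose predicted toxic.

    toxic_by_dose maps a dose (for example in mg/kg) to True when the
    prediction at that dose is kidney toxicity. Returns None when even the
    lowest tested dose is predicted toxic, i.e. no such level was observed.
    """
    level = None
    for dose in sorted(toxic_by_dose):
        if toxic_by_dose[dose]:
            break
        level = dose
    return level


if __name__ == "__main__":
    # Hypothetical dose series: toxicity is first predicted at 80 mg/kg.
    print(no_observable_effect_level({5.0: False, 25.0: False, 80.0: True}))
```

When every tested dose is predicted negative, the function returns the highest tested dose, which only bounds the level from below.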

[c49] A method of predicting the kidney toxicity of an agent using an in vitro 
system, comprising the steps of: 

obtaining a biological sample from in vitro cultured cells or explants
treated with the agent;

measuring the expression of one or more kidney toxicity predictive genes
in the sample, wherein the genes are selected from the group consisting
of the genes corresponding to the partial gene sequences in Table 32,
thereby generating a test expression profile; and

using the test expression profile with a set of reference expression profiles
in a Predictive Model to determine whether the agent will induce kidney
toxicity.

[c50] The method according to claim 49, wherein the expression of the kidney
toxicity predictive gene is measured at the RNA level.

[c51] The method according to claim 49, wherein the expression of the kidney 
toxicity predictive gene is measured at the protein level. 

[c52] The method according to claim 50, wherein the genes corresponding to 
the partial gene sequences are members of 24 hour Combo All. 

[c53] The method according to claim 50, wherein the genes corresponding to 
the partial gene sequences are members of 24 hour Combo 6. 

[c54] The method according to claim 50, wherein the genes corresponding to 
the partial gene sequences are members of 24 hour Combo 5. 

[c55] The method according to claim 50, wherein the genes corresponding to 
the partial gene sequences are members of 24 hour Combo 4. 

[c56] The method according to claim 50, wherein the genes corresponding to 
the partial gene sequences are members of 24 hour Combo 3. 

[c57] The method according to claim 50, wherein the genes corresponding to 
the partial gene sequences are members of 24 hour Combo 2. 

[c58] The method according to claim 50, wherein the genes corresponding to
the partial gene sequences are members of 24 hour Combo 1.

[c59] The method according to claim 51, wherein the genes corresponding to
the partial gene sequences are members of 24 hour Combo All.

[c60] The method according to claim 51, wherein the genes corresponding to
the partial gene sequences are members of 24 hour Combo 6.

[c61] The method according to claim 51, wherein the genes corresponding to
the partial gene sequences are members of 24 hour Combo 4.

[c62] The method according to any of the preceding claims 50-61, wherein the
expression of at least one gene is measured.

[c63] The method according to any of the preceding claims 50-61, wherein the
expression of at least five genes is measured.

[c64] The method according to any of the preceding claims 50-61, wherein the
expression of at least ten genes is measured.

[c65] The method according to any of the preceding claims 50-61, wherein the
expression of at least fifteen genes is measured.

[c66] The method according to claim 50, wherein the genes corresponding to the
partial gene sequences are members of 6 hour Combo All.

[c67] The method according to claim 50, wherein the genes corresponding to
the partial gene sequences are members of 6 hour Combo 6.

[c68] The method according to claim 50, wherein the genes corresponding to the
partial gene sequences are members of 6 hour Combo 4.

[c69] The method according to claim 51, wherein the genes corresponding to
the partial gene sequences are members of 6 hour Combo All.

[c70] The method according to claim 51, wherein the genes corresponding to
the partial gene sequences are members of 6 hour Combo 6.

[c71] The method according to claim 51, wherein the genes corresponding to the
partial gene sequences are members of 6 hour Combo 4.

[c72] The method according to any of the preceding claims 66-71, wherein the
expression of at least one gene is measured.

[c73] The method according to any of the preceding claims 66-71, wherein the
expression of at least five genes is measured.

[c74] The method according to any of the preceding claims 66-71, wherein the
expression of at least ten genes is measured.

[c75] The method according to any of the preceding claims 66-71, wherein the
expression of at least fifteen genes is measured.

[c76] The method according to claim 50, wherein the genes corresponding to 
the partial gene sequences are members of 72 hour Combo All. 

[c77] The method according to claim 50, wherein the genes corresponding to 
the partial gene sequences are members of 72 hour Combo 6. 

[c78] The method according to claim 50, wherein the genes corresponding to 
the partial gene sequences are members of 72 hour Combo 5. 

[c79] The method according to claim 50, wherein the genes corresponding to 
the partial gene sequences are members of 72 hour Combo 4. 

[c80] The method according to claim 50, wherein the genes corresponding to 
the partial gene sequences are members of 72 hour Combo 3. 

[c81] The method according to claim 50, wherein the genes corresponding to 
the partial gene sequences are members of 72 hour Combo 1. 

[c82] The method according to claim 51 , wherein the genes corresponding to 
the partial gene sequences are members of 72 hour Combo All. 

[c83] The method according to claim 51 , wherein the genes corresponding to 
the partial gene sequences are members of 72 hour Combo 6. 

[c84] The method according to claim 51 , wherein the genes corresponding to 
the partial gene sequences are members of 72 hour Combo 4. 

[c85] The method according to any of the preceding claims 76-84, wherein the 
expression of at least one gene is measured. 
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[c86] The method according to any of the preceding claims 76-84, wherein the 
expression of at least five genes is measured. 

[c87] The method according to any of the preceding claims 76-84, wherein the 
expression of at least ten genes is measured. 

[c88] The method according to any of the preceding claims 76-84, wherein the 
expression of at least fifteen genes is measured. 

[c89] The method according to any one of claims 50-61 , 66-71 , or 76-84, 
wherein the partial gene sequences correspond to rat genes. 

[c90] The method according to any one of claims 50-61 , 66-71 , or 76-84, 
wherein the partial gene sequences correspond to dog genes 

[c91 ] The method according to any one of claims 50-61 , 66-71 , or 76-84, 
wherein the partial gene sequences correspond to non-human primate 
genes. 

[c92] The method according to any one of claims 50-61 , 66-71 , or 76-84, 
wherein the partial gene sequences correspond to human genes. 

[c93] The method according to claim 89, wherein the agent is administered at 
different dose levels to determine the presence or absence of a no- 
observable effect level. 

[c94] The method according to claim 90, wherein the agent is administered at 
different dose levels to determine the presence or absence of a no- 
observable effect level. 

[c95] The method according to claim 91, wherein the agent is administered at 
different dose levels to determine the presence or absence of a no- 
observable effect level. 

[c96] The method according to claim 92, wherein the agent is administered at 
different dose levels to determine the presence or absence of a no- 
observable effect level. 

[c97] A computer program product for predicting kidney toxicity from a test 
sample expression profile, comprising: 
an encrypted training data set; 

encrypted lists of genes selected from the group consisting of the genes 
corresponding to the partial gene sequences in Table 32, to be used with 
the training set, and 

a Predictive Model that uses said training set, said lists of genes, and said 
test sample expression profile to predict the kidney toxicity of the test 
sample. 

[c98] The computer program product of claim 97, wherein the encrypted lists of 
genes comprise the 24 hour Combo 6, 24 hour Combo 5, 24 hour Combo 
4, 24 hour Combo 3, 24 hour Combo 2, and 24 hour Combo 1 gene lists. 

[c99] The computer program product of claim 97, wherein the encrypted lists of 
genes comprise the 6 hour Combo 6, 6 hour Combo 5, 6 hour Combo 4, 6 
hour Combo 3, 6 hour Combo 2, and 6 hour Combo 1 gene lists. 

[c100] The computer program product of claim 97, wherein the encrypted lists of 
genes comprise the 72 hour Combo 6, 72 hour Combo 5, 72 hour Combo 
4, 72 hour Combo 3, 72 hour Combo 2, and 72 hour Combo 1 gene lists. 

[c101] The computer program product of claim 97, wherein the prediction is 
made through the calculation of a certitude score. 

[c102] A method for mining genes predictive for kidney toxicity, comprising the 
steps of: 

collecting expression levels of a plurality of candidate toxicity predictive 
genes among a multiplicity of samples; 
defining a group of samples to be a training set; 
defining another group of samples to be a test set; 
optionally generating additional training and test sets; and 

selecting a set of genes which are predictive of kidney toxicity based on 
evaluating the training and test sets in a Predictive Model. 

[c103] The method according to claim 102, wherein the expression levels are 
stored as a database on an electronic medium. 

[c104] An integrated system for predicting kidney toxicity, comprising: 

means for measuring gene expression profiles of kidney predictive genes 
from biological samples exposed to the test agent; and 
a computer system operably linked to said means that is capable of 
implementing a predictive model. 

Discovery of Predictive Genes for Kidney Toxicity 



Kidney Database 
Kidney samples— rats treated with 45 cpds 
Rat CT Expression array data for samples 
Pathology data (72h samples) 



Classification of kidney toxicity 
("yes" or "no" for each sample) 



Pathology scores (semiquant.) 



correlation analyses 



Lists of genes correlating with 
histopathology scores 



Assignment of cpds/sample 
array data into 6 different 
training/test sets 

Training Set 1 Set 2 ... Set 6 

Test Set 1 Set 2 ... Set 6 



Predictive Model 
(predict kidney toxicity classification) 



Vary number of genes used in prediction 
Obtain optimum gene list (lowest number of 
genes with highest accuracy) for each input 
gene list and training/test set 



Merge optimum predictive gene lists for each training/test set 
Train/Test 1 List Train/Test 2 List.... Train/Test 6 List 



Merge All Train/Test Lists Into Combined List of Predictive Genes 

(Combo All) 

Sort Genes into Combinations by Number of Occurrences on 
Individual Training/Test Lists 
(Combo 6, Combo 5,... Combo 1) 



Figure 1 
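
The final two boxes of Figure 1 (merging the per-set lists into Combo All, then sorting genes by number of occurrences) can be sketched roughly as follows. This is only an illustrative sketch: the function name, the gene identifiers, and the reading that "Combo N" means a gene appears on exactly N of the individual training/test lists are assumptions, not details taken from the specification.

```python
from collections import Counter


def sort_into_combos(train_test_lists):
    """Merge per-set gene lists and group genes by number of occurrences.

    Returns (combo_all, combos), where combos[n] holds the genes found on
    exactly n of the individual training/test lists (an assumed reading).
    """
    # Count each gene once per list, even if a list repeats it.
    counts = Counter(gene for genes in train_test_lists for gene in set(genes))
    combo_all = set(counts)
    combos = {}
    for gene, n in counts.items():
        combos.setdefault(n, set()).add(gene)
    return combo_all, combos


# Hypothetical gene identifiers on three hypothetical training/test lists.
example_lists = [
    ["geneA", "geneB", "geneC"],
    ["geneA", "geneB"],
    ["geneA", "geneD"],
]
combo_all, combos = sort_into_combos(example_lists)
```

With six training/test lists, as in the figure, the keys of `combos` would run from 1 to 6, and `combo_all` would correspond to the combined list.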

Overall Percent Correct Calls vs. Number of Predictor Genes 
Test and Training Set A - HistoCorrelating Genes (Pearson) 

Figure 2 

Evaluation of Predictive Genes for Kidney Toxicity 



Evaluated Gene Lists 
Combo All and Combo Sets 
Individ. genes in best Combo sets 
Randomly selected subsets 
Cumulative genes in Combo sets 
Subsets of "non-predictive" genes 



6 different training/test sets 
(same as for identification) 

Training Set 1 Set 2 ... Set 6 

Test Set 1 Set 2 ... Set 6 

Accurate and random 
classifications 



Predictive Model (KNN) 



Predictive Performance 

(means and ranges for 6 different training/test sets) 

Prediction Units— Sample, Cpd-Dose, Cpd 

Accuracy — proportion of correct classifications 

False positive — proportion of incorrect classifications for negative 
samples 

False negative — proportion of incorrect classifications for positive 
samples 

Geometric Mean — measure of predictive performance that considers 
proportion of pos. and neg. samples 

Comparison of accuracy for accurate and random classification 



Figure 3 
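
The performance measures listed in Figure 3 can be illustrated with a short calculation. This is a hedged sketch rather than the method used in the specification: the function name and the example labels are hypothetical, and the geometric mean is assumed here to be the square root of the product of the correct-call rates for positive and negative samples, which is one common definition.

```python
import math


def predictive_performance(actual, predicted):
    """Compute accuracy, false positive/negative rates and a geometric mean.

    `actual` and `predicted` are equal-length sequences of booleans
    (True = kidney toxicity "yes"). Both classes must be present.
    """
    pairs = list(zip(actual, predicted))
    pos = [p for a, p in pairs if a]        # predictions for positive samples
    neg = [p for a, p in pairs if not a]    # predictions for negative samples
    accuracy = sum(a == p for a, p in pairs) / len(pairs)
    false_negative = sum(not p for p in pos) / len(pos)
    false_positive = sum(bool(p) for p in neg) / len(neg)
    # Assumed definition: sqrt(correct rate on positives * correct rate on negatives).
    geometric_mean = math.sqrt((1 - false_negative) * (1 - false_positive))
    return {
        "accuracy": accuracy,
        "false_positive": false_positive,
        "false_negative": false_negative,
        "geometric_mean": geometric_mean,
    }


# Hypothetical classifications for eight samples (four toxic, four non-toxic).
actual = [True, True, True, True, False, False, False, False]
predicted = [True, True, True, False, False, False, False, True]
result = predictive_performance(actual, predicted)
```

Unlike plain accuracy, a measure of this form stays low when one class is predicted poorly, even if that class makes up a small share of the samples.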

Cumulative Percent Accuracy 
Combo 6 Gene List 

Figure 4 

Cumulative Percent Accuracy 
Combo 5 Gene List 

Figure 5 

Cumulative Percent Accuracy 
Combo 4 Gene List 

Figure 6 

3 meta groups 

Figure 8 

Figure 8 (continued) 

Animal  Treatment                      Kidney Tox.  IGFBP-1 Diff. Expression
144     polyethylene glycol - 5 mL/kg  No           1.22
146     polyethylene glycol - 5 mL/kg  No           1.16
354     LPS - 8 mg/kg                  Yes          18.13
355     LPS - 8 mg/kg                  Yes          5.14
2234    ketoconazole - 80 mg/kg        No           -1.04
2236    ketoconazole - 80 mg/kg        No           -1.07
2354    chloroform - 0.5 mL/kg         Yes          1.93
2356    chloroform - 0.5 mL/kg         Yes          8.86

Figure 9 

Table 32 Genes Predictive for Kidney Tubular Necrosis, Sequences, and Accession Numbers 





Accession Number 


Sequence 




u 
a> 

N 

n 

1 

ro 
i 


D17615 


TGGNGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATT 

CGCCCTTCGCGGGATCCAAAAAGCAGCAGATGGCTCGAGAATACAGAGAGAAGATCGAGACGGAG 

CTGAGGGACATCTGCAACGACGTACTGTCTCTTTTGGAAAAGTTCTTGATCCCCAATC 

GCCAGAAAGCAAAGTCTTCTATTTGAAAATGAAGGGTGACTACTACCGCTACTTGGCTGAGGTTG 

CTGCTGGTGATGACAAGAAAGGAATTGTGGACCAGTCACAGCAAGCATACCAAGAAGCATTTGAA 

ATCAGCAAAAAGGAGATGCAGCCGACACACCCCATCAGACTGGGTCTGGCCCTCAACTTCTCTGT 

GTTCTACTATGAGATCCTGAACTCCCCAGAGAAAGCCTGCTCTCTTGCAAAAACAGCTTOTGATG 

AAGCCATTGCTGAACTTGATACATTAAGTGAAGAGTCGTACAAAGACAGCACGCTAATAATGCAG 

TTACTGAGAGACAACTTGACATTGTGGACATCGGATACCCAAGGAGACGAAGCAGAAAAGCTTGG 

CCAAGGGCGAATTCCAGCACACTGGCGGGCCGNACTAGTGGATNCGAGCTCNGTACCCAGCTTTG 

ATGCATA 


25-DX 


U63315 


TCCGATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCG 
CCCTTACACACACAGTGGCCCCAAATGGTCAATGTACTAAGAATGAAGAGAGAAGGTCTTAGCCA 
TGCTATAGPrcTGAAAACAAGCCCATTTTACCCAACAGACTTAAC 

ACCTCTAAAGCAAAACTGCAGTGTTCCAAAGTCTGTGGTATTGATTCAAAACAGAAGTCCAGTAA 

CAAAATGAAAACTCAATAATGGGTTTAGTTGGGGCAAACACATTGCCTGTGTTCAT^ 

TATCATCCCTGTCCCCACGTGGACACTCCCACACACAGTAACTCTCACACACCTGGTAATTGGCA 

GTTGGAACTACAACAGAATCTGAAAATTCAAGGTAGAACTTTGCAAAAGAAAAATCTGTTGCATG 

TAGCAGGGCAATGGTTATGTGGTTATTGGCCAAATGTAAAATTTGAGAGCAATATACAGGACAGG 

AACAGGTGTGCGCTAAGGGCGAATTCCAGCACACTGGCGGCCGTTACTAGTGGATCCGAGCTCGG 

TACCAAGCTTGATGCATAGCTTGAGTATTCTATAGTGGCACCTAAATAGCTTGGCGTAATCATGG 

NCATAGCTGTTTCCTGTGTGAAATTGGTATTCCGCTCACAATTCCACACAACATC 

AGCATTAAGTGT 


25-hydroxyvitamin D3-1 alpha-hydroxylase 


AB001992 


GCGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGK3ATATCTGCAGAATTCG 
CCCTTAACTAACAGCCCCAGGCAGCCTGGGCAGGGATCCCCCACTGATCCTTCCATGCTTACAGT 
GTTCACTGACAGCTGTCTAAGCATCCATTGCAGCACAAACTAAGTGACTGTGCACCTGGTCTGCA 
CCTGGTCTGCACCTGGTTGCGTCTCTGCCTGACCATGTGAGCTCTTTGAGAAGAGTGATGACTAC 
TGGGCTTTTAGCTCTTTTCCTTTTTGGGACACAGTCTTGC 

CCCACAAGCCCTCACCTCACCTTCCCAAGTGTTGGGTTACGGACATTAGCTATGGCTTCCAGCTT 
TATTAGTCTTTCTATCTCCTGCCATGGTCTATCCCCGGCTATTTGATACTATATATTCTCAGATT 
~,t> * rnnwr*r> &rr 7vav~"TY7r ,r ra r* a anfifi&Tf; A rr ACTC ACC AGGCTCTACCCACCACTTTATCTTAAT 
CTTTTCTCTAGGAAAGTGAATCTCTCCTTGCCTTACAGCATTTTA 

GCTCTAAGGGCGAATTCCAGCACACTGGCGGCCGTTACTAGTGGATCCGAGCTCGGNACCAACTT 
GATGCATAAGCTTGAGTATTCTATAGGNCACCTAAATAGCT 


3-beta-hydroxysteroid dehydrogenase (HSD3B1) 


AA923963 


NGCCAAGCTAAAATTAACCCTCACTAAAGGGAATAAGCTNGCGNCCCGCAANGOT 
TANTTTTTTNNOTTAAAGCCATTACAATATTT 

CAGGATGTGATTTAGGACTTGAAGAGGAACAGAAAAGTAATACTCAGCTCTAAGTGACAGGAAAT 

TGTCATTGCTGAAGCCTTTGGTCACAGCAGCTGAGGCACAACTACCTGTGTGTCTCTGGACAGGG 

GATTAGGGAAGAAAGCTTGTGGACTAGCAAGGCTTCCAGTGAAGTCATAAAGACATAGAGTTAGA 

GTCTGTGTCAAAAGAGGGCATCAGGACCTGGATTGTGCCTCTGTCCTAGCTGGAGGACCTGGTAA 

CACCCAGAACCACATCCTTGCCCCCTTTCTGTCACTGAGACTTTGTGTCCAGTGTC 

TGCTCCACTAGTGTCCCGATCCACTCCGAGGTTTTCTGCTTGGCTTCCTCCCAGCTGA 

CACATAGCCCAGATCTCTCTGAGCTTTCTTGTAGGAGAAAGTGAACTTGCTATTTC 

CCAAGTGGCAGTTAAAGGGTGGCCTATAGTTGTAAAATCCTCGTGCCGAATTC 

GCCAAATTCCCTATAGTGAGTAGTATTAAATTCGTAATCATGTCATAGAG 


3-hydroxyisobutyrate dehydrogenase 


J04628 


Sg^aS^CTACTC^ 

^^^GAC^CCTCT^ATGGACACTCTTAGGAACCAAACTCTGTCCTGAGCTTCC^TA 
^CTCMT^TAGAAGTCATGGGTTO 

^aatccctcgggatttt^^ 

^TTG^G^ACTCTGATCAGGATAlTTrATTTTACTCTGCTTTAC^ 
rAACAAGATGGAACCATGGTAAAAAAGGAAAAAAATCAGTTN 


IS 

(0 0 
rH U 
S >. 

U D 

0) 

e 

i 

m 


X56420 


^TTGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGC^AATTCGCCCT 
^B^^R^rTTCTCCGGCGACTTGCTGATGGAACAGAACTCCGTGGGCGCATTGTAGA 

^catactoSSccag^gacgaagctgcccac 

^^^Stgcggcmcttcgaaactccctccggaaaagcactgtcggccgttccctcaaggacc 

g^gmctctcta^tcgtccctccaaactgtgccaggccctagccattgataaga 

toagacctcncccaagatgaggctgggtggctggagcatggccctctggagtc 

tctcg^«£ag^c^ataggta^^ 

mt^ag^cactgncgg^gotactaatggatccaactcggnaccaancttgatgcatancttg 

mATTCTATATGGCACCTAAAAACTTGGNGAATATGGGATACT , 


60S ribosomal protein L6 


X87107 


ANCGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCGCCCT 

Stcctcagccacggcaagaagcccttcagccagcacgtgaggaggctgcgc 
c?gSactcStca^^tcactgggcgccgcaggggcaagagagt<mctttcctcaa^ 

acctcactcat^acctcaagaagaagccacttcgcaagcccaggcatcaggagg^ 
tocgacacagagaaggagaaatacgaaattacagagcagcgaaaggctgat^^ 

ANGGCGAATTCAGCACACTTGGCGGGCGGTACTAGTGGGATNCCN ivr^jv AfT^p 


60S ribosomal protein L6 (alternate clone 1) 


X87107 


CTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAA'IT 
CGGCAGGTATT^CGATCTTCCTATGTATTCC^ 

^T^GGACAMAACGGTGGCACCCGGGTGGTGAAGCTTCGAAAAATGCCTAGGTATTACCCTAC 

TOAAGACGTG^CTCGGAAGCTCCTCAGCCACGGCAAGAAGCCCTTCAGCCAGCAC^ 

^^^^GCATCACTCCCGGGACTGTCCTGATCATCCTCACTGGGCGCCACAGGGGCAAGAGA 


Acetylcholine receptor 
epsilon 


f- 

3 


TATGACATGATTACGAATTTAATACGACTCACTATAGGGAA1 i 1°^^ AVte4Wa ^*-"™"^±i« 
^ScGAGGG^ACA^GGCATCGGCATGGAACTTGGACCGCAGCCGCCCTCTGCCAGAACCTGA 

GAMAGGAAGCCACTGGAGAGGAACTGTCTGACTG 

ctgtttttcggcagcgttSgtcctcttcagcgtcggttct^ 

TCAA^G^CTGATCTCCCCTACCCACCGTGCATCCAACCATCAGCCTGCACCAGGA^ 
CTCATCCCCACCCCCCAAGAAAGAGATTTTGAAAACAGGCTGCTGACAATAA^ 

ttt^TCAGGGTTO 

cctocacccgatccccttccaacagttgcgcancctgaatggcgaatg 

CNCA ' 

8 

•H C 

*J o 

> « 



AGCTATGNCCATGATTACGCCAAGCTATTTAGGTGCCACTATAGAATACTCAAGTATGCATCAAG 
TTGGTACCGAGCTCGGATCCACTAGTAACGGCCGCAGTGTGCTGGAATTCGCCCTTTGCGGCTGA 
CAACATCCCTCCTAGGGAAGATGGAGTGAGAACATTCATCATTGAAGTTGTCCAATGGCCAGGGT 
ATGCTTTCTAGAAACTATGCTGTTCTGTCCTAGACTGACTGTGCATAGGGCATTCGTTTCTGAGC 
CTGGTGTTGTGCTATTTAGATGTTTGTCTTGCACAACATTGGCGTC 

ATCAGACCTGATTTCCGAGAGTTTGGGGGTCTGCCACTGTGGACAATATCCCCCAAAAGTGTTTC 
GGTGGCCATGTAAACTGGCTGATGACCAGCTGTGCTACTCTGTGCTGACCGAGGACTGATGCCTC 
CTTCCCCTGTACCCACTGCTGAGGAAGAACCCGGGCACAGCAGCTGTCCTTGGCTACAAACTGTT 
ACAATGTCACAGAACGAAGGCACAAAGTCCCGCTTTCAAAGGGCGTAGGACTCCACACTCAGTGA 
CAGGGCAGGAAGAGCCAAGGGCGAATTCTGCAGATATCCATCACACTGGCGGCCGCTCGAGCATG 
CATCTAGAGGGCCCAATTCGCAA 



TGNGGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTCTGATGGATATCTGCAGAATT 
CGCCCTTCGCGGGATCCACACGCAGGAATGGCAATGCTCTGTGAAACGATAGAAGAATGCTGGGA 
TCATGATGCAGAAGCCAGGTTATCAGCTGGATGTGTAGGTGAAAGAATTACTCAGATGCAAAGAC 
TAACAAATATAATTACTACAGAGGACATTGTAACAGTGGTCACAATGGTGACAAATGTTGACTTT 
CCTCCCAAAGAATCTAGTCTATGATGGTTGCACCATCTGTCCACACTGAGAATCGGGACTCTGAA 
CTGGAGCTGCTAAGCTAAGGAAACTGCTTAGTTTATTTTCTGTGTG 

GGGACACGTATGCAAGCAGCCCCTTGTGGAAAGCATGGATTGGGAGACTTCCTGCAGCGTCTGCA 
ACACGGATATGAAGGGGGTCTAAGGGGAAACTGCGAACTGTAAAGAACTTCTGAAAACTTACACG 
AAGAATGTGGCCCTCTCCAAATCAAGGATCTTTTGGACCTGGCTAATCAAGTAAGCTTGGCCAAG 
GGCGAATTCCAGCACACTGGCGGCCGTTACTAGTGGATCCGAGCTCGGTACCAAGCTTGATGCAT 

AGCTTGAGTATTCTATAGTGCACCTA 



&I1 



TTAAGATTTTCCAGAAAGTTTTTAGTTTTAATAATGGGAGAAAATC 

TTTTGCAATACAAGGGAAATATAATCAAAATTATGAAACAGCTCAGCACAGAAATGCTCTCTC 

CACAGTGTGATGGGTCCAATCCGCCACATTCCTCAGTGTCTGTGTAGAATCCTGAGTTCTTTC 

GACAGGCTACCTTTCTTGTCACTTGAATGAGCATTAGAACTCCAGGGTATTTCTCCATCTCAAGT 

GATCAAGGAGCAAATTAACACAATTCGGGGCCAAGGACTGACCACTCACTGTCCAAAACCAAACC 

TATCTAGAAAGGGTCACTGTGGAACTCAACAGAGGCACCCTCCGCCGTGGGACCCCATTGCACAG 

ATCTGTGACACGCGTCTTTTATCTCATTTCTGTGGGAAGACAAGTAA^ 

CAGCAACAGTGCTCTGAGCGTAGTTACATGAGGGTGAAGCATCGTTCGATAGTAATTTCTGTTAA 
TTTTTATACTTTTCAATGTGCTCACGAGCTATGATCAGCCTCTGAATTTC 

AATCTGATCCTCGTGCCGAATTCTTGGCCTCGAGGGCCAAATTCCCTATAGTGAGTCGTATTAAA 
TTCGTAATCATGTCATAGA 



o 

to J. 



9 



GGTGACAGCCCTCCTCCGAGGAGCAGCAGTGGGCAGCAGTAGCAGCAGCAGGTCACGGTGGGGTT 
GGGAATGTTAGTCTGGCCAGCTGGTGCTTATCCCTCCGGCTGGTGTCGGATGAGGACAGGGACCT 
AGACAGTGAGCTGGGACACTGAAGAAGCTCAGTCGGCAGTAAAGACACGACTGGAAATGTCATCG 
AGGAGCCAGTCGATGCCAGGCAGCAGGTCCTCCCCTGTGACGGCACTGCAGCCTTGGATGCGCCA 
GTGGTGGCTGCGGATGGAGTCCAGCTCTAGGGCCTCCTGAATAGCATTACAGGACAGTGCTCCAG 
GCAGGTCCTGCTTGTTGGCAAAGATGAGGAGGGTCGCTCCAGCCAGGCGCTCCTCCACCAGTAGA 
CTCTGCAGCTCTCGCTGACAGTCCTGCATGCGCTGGCGGTCAGCGCTGTCCACCACCCAGATGAG 
GCCATCTGTGCTCTCGAAGTAGTTCCTCCAGTAGGAGCGCAGAGACTTCTGGCCACCCACATCCC 
AGATGTTCAGCTTGAATCCCCGGTGCTCCAGGGTCTTGCCTCGTGCCGAATTCTTGGCCTCGATO 
GCCAAATTCCCTATAGTGAGTCGTATTAAATTCGTAATCATGTCATAG 



i 



ATTIHX^CATGATTACG AATTTAATACGACTC ACT ATAGGGAATTTGGCC CTC GAGGCC AAGAAT 
TCGGCACGAGGCCTGCGCTCAGCACTAAGGAGTCACTGTTAGCTGTTCTGCCGGCGGGTTGCTCT 
TCTCAGCCATGGCTCCTCGCTGCTGGCGCTGGTGGAGCTGGTCCGCGTGGCCTGGGG 
CTTCCCTCCAGGAGCACTCCGACCCCTGGCTTCTGCAAGAAGTTCTCCACACAGGAGACAACCCC 
TCAGATCTGTGTG^TCGGCAGTGGCCCAGCTGGCTTCTACACAGCCCAACACTTGTTGAAG 
CACCCGGGCCCACGTAGACATCTATGAGAAGCAGCTCGTCCCCTTCGGCCTGGTGCGCTTTGGTG 
TGGCACCTGACCATCCTGAAGTAAAGAATGTCATCAACACATTTACACAGACAGCCCGCTCAAAC 
CGCTGTGCCTTCCGGGGCAATGTGGTGGTGGGCAGGGACGTGTCCGTTCCAGAGCTTCGGGAAGC 
CTACCATGCTGTGGTTCTGAGTTATGGAGCCGAGGACCACCAACCCCTGGAAATCCTGGCGAGGA 
GCTGCCTGGAGTGGTCTCAGCCGGGCCTTTGTGGGCTTGTACAATGGGCTTTCTC 
ANCTGGCTNCGGATCTGAGCTGGTGACACGGCTGTGATTTTGGGGCAAGGGAATC 

ATGTGGGCCCGGANC _______ 


Alanine aminotransferase 


D10354 


kk:gaattgggccctctagatg(^tgctcgagcggccgccagtgtgatgga 

rTCTAAACATGGATGCTGAGGTGCAGAAACAGATGGGGAAGCTGATGAGTGTGCGGCTGTCT 

^g^caggccaggccttcatggacatggt^ 

: ^G^GmCAAGCAGAGAGACAGGAGGTGCTGGCTGAAC^ 
^AGGTCTC^ 

CTC^GTCC A^TGC C CTTG AAAGC GGTGC AGCGTGCTC AGGAACTGGGCCTGGC ^CCT^C AT 
3TTCTTCTCCCTCTCCCTCCTGGAAGAGACTGGCATCT 

^nTTCGCCAANGGCGAATTCAACACACTGCGGN^ 

ITGATGCATAGCTTGAGTATTCTATAGNGN — . 


Alcohol 
dehydrogenase 1 


r* 

<N 

tn 
m 


TAGGTGACACTATAGAATACTCAAGCTATGCATCAAGCTTGGACCGAGCTCGGATCCACTAGTAC 
CGCCCGCCAGTGTGCTGGAATNCCCCTTTCGCGGGGATCCAGTCGCCAAGGTGACCCCAGGCTCC 
ACCTGTCCCGTCTTTGGCCTGGGAGGTGTTGGTCTGTCTG 

W^AGCCAAGATCATTGCCGTGGACATCAACAAAGACAAGTTTGCGAAGGCCAAAGAGTTAGGTG 
CCACTGACTGTATCAACCCTCAAGACTACACCAAACCCATCCAGGAAGTTCTCAGGAGATGACTG 
ATOTAGGGGTGG ACTTTTC ATTTCAAGTC ATTGGCCGTCTTG ATAC C ATGACTTCTGCCC TGTTA 
AG^TGC^ATTCAGCAT^CGGTCTAAGCGTCATTGTCGGGGTGC 
CGTTAACCCCATGTCGCT^^ 

GTAAA^ATCCCGTCCCCAAACTTGTCGCTGACTTCATGGCTAAGAAGTTTC 
AT^TCATGAAGCTTGGCCAAGGGCGAATTCTGCAGATATCCATCACACTGGCGGCCGCTCGAG 


Aldehyde 
dehydrogenase 1 


M23995 


AGAATACTCAAGCTATGCATCAAGTTGGTNCNGAGCTCGGATCCACTTAGTAACGGCCGCCAGTG 
TCTCGGAATTCGCCCTTCGCGGGATCCTGTTAGGAGGAGTGTGGAGCGGGC 
TAGGAGATCCTCTGGACTCAGGAATAAGTCAAGGTCCTCAGATTGACAAGGAGCAACATCCTAAA 
AT^TTCATCTCATTCAGAGTGGGAAGAAAGAAGGCGCCAAACTGGAGTGTGGTGGAGGACGCl^ 

gggSa^ggcttc^tccagcctacagtcttctccaatgtgaccgatgagatgcgcattc 

C^wIaGAGGAGATATTTGGACCAGTGCAACAA^ 

AAGAGAGCCAACAATACTCCCTATGGTCTAGCAGCAGGAGTCTTCACAAAAGACOT^ACAGGG^ 

CATCACTGTGTCTTCTGCTCTGCAGGCCGGGACAGTGTGGGTGAATTGTTATTTGACTCTTTC 

TCCAGTCCcESGGTGGGTTCAAGATGTCrWAAATGGGCGAGAAA 

rT^CCAAGGGCGAATTCTGCAGATATCCATCACACTGGCGGCCGCTCGAGCATGCATCTAGAG 


Aldehyde 
dehydrogenase 2 


X14977 


gV^gaattggccctctagatgcatgctcgagcggccgccagtgtgatggatatctg^gaattcgc 

CCTTACAAGAAGGGGCGAAGCTGCTGTGCGGTGGGGGCGCCGCCGCAGACCGTGGTTACTTCATC 

CAGCCCACCGTGTTCGGAGACGTCAAAGATGGCATGACCATCGCCAAGGAGGAGATCTTCGGACC 

AGTCATGCAGATCCTCAAATTCAAGACCATTGAGGAGGCTGTGGG^ 

A^GGGCTGGCTGCCGCTGTCTTCACAAAGGACCTGGACAAGGCCAATTACCTGT^ 

CAGGGTGGGACTGTGTGGATCAACTGCTACGATGTGTTTGGGGCCCAGTCCCCATTTGGTGGCT^ 

TAAGATGTCGGGGAGCGGCAGGGAGCTGGGCGAGTATGGCCTGCAGGCCTACA 

CGGTCACCGTCAAAGTGCCACAGAAGAACTCGTAAAGTGGCGTGCAGGCTTCCTCAGCCA^ 

CAAAAACCCAACAAGATCCTGAGAAAAGCCACCACCAAGCACACTGCAAGGGCGAATTCCAGCAC 

ACTGGCGGCCGTTACT.M 


Aldehyde dehydrogenase, microsomal 


AA956846 


gccaatgantcgttcttgatacot<:atacaatacanttcagacaggttcaacgggt^ 

TANTCAGCAAACATGGTTGTACATACGCAAGTGACATTCTCATTGAGTGGGCAGTCTTAA 

ATCCTGGGAAGGCAATCAGACCTCTGCAGCTTGGAGCTTCAGGCCTAACCAGGCC^ 

GGCCACACAGATCTCTCTGCCACATCCCCGTGTAGCCATGTTAGGGTCTGACAGAGAATGCTGCC 

r-TCC AACAACCCTGCCCCATGAGTTCTGAAGCCTCATGTCTAATAACCCCCCTTTCACTACAGGA 

ACAGCTCCATGCTAAGATCTGTAGGAATCTCTCTAAGATTTCAAAAGTAAATTCTT^TCCTG 

TCCTGTGAAAATAAGCAGATTTATCCTGAAGACCTATATAACATTCAAATCGAGGG 

GATTGTTATGGAGGAATAATAATGAGAGGGTCCCATTATGTTGCCTAGGTTGG 

GGGCTCCGGAATCTTCTGCCCTCGGGCTCTGGCCTCGTGCCGAATTCTTGGCCTCGAGGGCCAAA 


Alpha-1 acid glycoprotein 


AI029162 


^TC^^^A^GArmCACTGCTCCTGCACACrTGGATAAGGTCCCATTCTCTCTCTG 
3ACTCCTAGATGGGTGAAGTTATAGACACACTGGTCGTCTGTGGTCTGAAACTCCCGAAGTTCAA 
rTCTCTCG^TATCAAGTTGGGGGTAAGGTAAAAATATTCCGTCTGTATCGTTTGAACTGCCTCC 
rTCAACACGGGGTCTCGGAAAGCTGCTCCCATGTAAAACCATTTGTC 

T^ATTGGTAATAGGTATGCCTAGGGTGATGTTGGCAGGTTCTGGGTTCTGAGCTTCCAACAAGG 
3CAGGAGGCTCAAAACGACAAGAACCATGTGCAGCGCCATGCCGAAGACCCTCGTGCCGAATTCT 
rGGCCTCGAGGGCCAAATTCCCTATAGTGAGTCGTATTAAATTCGTAATCATGTCATA 


Alpha-1 microglobulin/bikunin precursor (Ambp) 


AI043784 


&GCNNCNCCNCCCCTTGCACGCCATGOTITCCTTC 

AGCTTCCNTTCCTNTCCAATACCCTTATTGTGACAATGAGATGCTAACACACAGAATGCTAGGAA 

^A^AG^^A^GGTCCAGATCATCTAAGCTGGACTCCACAGTGACGGACCGTCGCTA 

CCCTCTGGCT^CAGACCGGCTCCTTCAACTGCGTGTTAGCTCCTCGTACCCATCACCAGG 

CCACAGTACTCCTTGCACTCCTTCTCAGAGTAGAACTTGTTGCCGTTGCCTTTGCAGCCCCCATA 

GATC^TTGGATCCACTTCCCTTGCGCTGCATCAAATC 

GAGGCGAAGTTGTTACCArrGCCTAGGCAGCCGCCATACTGGAAGGTCTCGCAGGCCA^AGGC 

AGTC^CTTTCTTGAGGGTCCCAGTTATGAGTGGCTCACCCTCGTGCCGAATTCTTGGCCTCGAG 
GGCCAAATTCCCTATAGTGAGTCGTATTAAATTCGTAATCATGTCATA 


alpha-1,2-fucosyltransferase 


AB015637 


CGAATTGGGCCCTCTAGATCTCATGCTCGAGCGGCCGCCAGTGTGATGGATC 
CC^CGCGGGATCCTTGGCCCCAGAGAAAOTTCAAAGACTTTTATTTAGGGTGGGCTAAAAGTGG 

^'^GAGGMAGAACCGCTGT^ 
AAAA^^AGA^C^rmTAA^ 

TGAATCTAGGATCGAGAAGGTTTCTTCTTCTGAAGATGCTATAAGGGATCTGGGATCTGTGTO 

ggagg^cIaggcctccagaaaagaagggtatacccctgggctcacaa^ 

^AGAAGGTCAAGGGTGAAGGATCAGAAGGTGTTCCAGAACTGTGAGCTCAGCCTAAG^CTA 
TAAAACACGGGCTGGAAGCAAAGATGGGACATTTCMCTGACATCAGGTATACTTGATGCTCT 
AAGAACGGAGAAAGTTOCCTCAGATAGCATTTAACCTAATTGTATTCCCTTCCAAGA^ 
^T^CCAAG^GAATTNCAGACACTGGCGGNCGTTCTAGTGGATCCGAGCTCGGACCAAGCT 


Alpha-2-macroglobulin 


J02635 


GCGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCG 
CCCTTGCCGACTGTGCCTGGGGATTACACCGTGAAGGTGACAGGAGAAGGCTGTGTCTACCTCCA 
GACATCCTOGAAATACAGTGTTCTCCCGAGAGAGGAGGAGTTCCCCTTCGCTC 

ctSctcggacatctcaggatcccaaagctcac^ccagcttccagatctcactc^ 

TACACTCGAAGCCGTTCTGAATCCAACATGGCAATTGCTGA 
ATAACCATGTCTTGATTTACCTGGATAAGGTGTC 

S£Sa?a^aS^ 

AGATGAGTTTGCAGTTGCAAAATACAGCGCTCCCTGCAAGGGCGAATTCCAGCACAC 

g™ctagtggatccgagctcggtaccaagctogatgcatagcttgagtattctatagtgtcacc 

* ...I I, in ij-imnmn tl ti tl mrnr* 


Alpha-fibrinogen 


X86561 


TCTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTC^GGCC^GAAT 
TCG^ATGAG^TCAAAAACTCACTATTTGATTTTCAAAAGAACAA^ 

ACCAGGAATATCATGGAGTATTTGAGAGGGGACTTCGCTAACGCCAACAACTTTGATAACACTTT 
CGG^A^TGTCAGA^ACCTGAGGCGCAGAATTGAGATCCTAAAGCGCAAAGTCATAGAGAAAG 

CGCAACAGATTCAGGTTCTGCAGAAAGACGTCCGGGATCAGCTGATAGACATC 
GTG^CACTGATATCAAGATCCGCTCTTGCAAAGGATCCTGCAGCAGGTCTGTAAGCCGTGAGM 

AAATCTAAAGGACTACGAAGGTCAGCAAAAGCAACTTGAACAGGTCA 

C^GCAAAM^AGGCAGTACTTGCCAGCAATAAAAATGTCTCCAGTTCCC 

AGTTTTAAGAGCCAGCTTCAGGAGGGGCC^ 

GAGAA^AGC^AGACMCCGGGAAGGATCGGGCTTCGCGAGGAGATTTACCAGGAGATTCGCG 
AGGAGACTCTGCNACACGT 


GGGAATTGGGCCCTCTAGATCCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCG 
CCCTTGCGCGGATCCTGATGTGGTCCCCAAAGATGTCAATGCTGCCATTGCCACCATCAAGACCA 
AGCGCAGCATCCAGTTTGTGGACTGGTGCCCCACTGGCTTCAAGGTTGGCATTAATTACCAGCCT 
CCCACTGTGGTACCTGGTGGCGACCTGGCCAAGGTCCAGAGAGCTGTGTGTATGCTGAGCAACAC 
CACAGCCATTGCTGAGGCTTGGGCTCGCCTGGATCACAAGTTTGATCTGATGTATGCCAAGCGTC 
CCTTTGTGCACTGGTACGTGGGTGAGGGAAGCTTGGCCAAGGGCGAATTCCAGCACACTGGCGGC 
CGTTACTAGTGGATCCGAGCTCGGTACCAAGCTTGATGCATAGCTTGAGTATTCTATAGTGTCAC 
CTAAATAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAAT 
TCCACACAACATACGAGCCGGAAGCATAAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAA 
CTCACATTAATTGCGTTGCGCTCACTGCCCGCTTTNCAATCGGGAAACCTGCGTGCCAACTGCAT 

TAATGAATCGC 



GTGAATTGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCGC 
CCTTATCGCGGGATCCGGGGGACGGATGAAGAGAAGTTCATCACCATCCTTGGGACACGCAGTGT 
GTCTCATTTAAGAAGAGTGTTTGACAAGTACATGACAATATCAGGATTTCAGATTGAGGAAACCA 
TTGACCGAGAGACCTCAGGGAACTTGGAGAACTTACTCCTGGCTGTCGTGAAGTCTATTCGGAGC 
ATACCTGCCTACCTTGCAGAGACCCTCTACTATGCTATGAAGGGTGCTGGGACGGACGATCACAC 
CCTCATCAGAGTC^TAGTGTCGAGGAGTGAGATTGATCTGTTTAACATCAGGAAGGAGTOT 
AGAACTTCGCCACGTCCCTGTACTCTATGATCAAGGGCGACACATCTGGAGACTATAAGAAGGCC 
CTGCTGCTCCTCTGTGGAGGCGAGGATGACTGAGGAGCTGCCTGGAGTGCCCTGGGCCCGCCT^ 
TCCCCACCATCAGCTTCCTTCAGCACCACGCCTACTTACGTTCAATGCCTGCCTGAAGGGCGAAT 
TCCAGCACACTGGCGGCCGTTACTAGTGGATCCGAGCTCGGTACCAAGCTTGATGCATAGCTTGA 
GTATTCTATAGTGNCACCTAAATAGCTTGGCGTAAT 



u 
c 
o 
o 

U 

a 

a 

•H 

o 



GCAAACCGCTTCTCCCGGGGCGTTGGCCGATTCATTAATGCAGTGGCACGACAOT 

GAAAAGCGGGCAGTGAGCGCAACGCAATTAATGTGAGTTAGCTCACTCATTAGGCACCCCAGGCT 

TTACACTTTTATGCTTCCGGCTCGTATGTTGTGTGGAATTGTGAGCGGATACC 

GAAACAGCTATGACCATGATTACGTCCAAGCTATTTAGGTGACACTATAGAATACTCAAGCTATGC 

ATCAAGCTTGGTACCGAGCTCGGATCCACTAGTAACGGCCGCCAGTGTGCTGGAATTCGCCCTTG 

CGCGGATCCTCACGGCTCAAGAGTTGGTGTTGTTAGTTGGTCCTCAGGGCCAGACTCCCAGAGGC 

CAGTGAACTTATCAGTGAACTTGCTCCAGTAGCCTTTCAGGGATTTGAAGCGATTGTCCATCCAG 

CCCCTGGCCACCACAGCTATATCAGACTCCTGCATGCTGCTTAGTGCATCCTGGACCGTCTTGGA 

GGCTTGTTCCATGTAGCCCTGCATAGAGCCCAGCAGCAAGGATCCCTCTCCCTCATCAGGGAAGC 

TTGGCCAAGGGCGAATTCTGCAGATATCCATCACACTGGCGGCCGCTCGAGCATGCATCTAGAGG 

GCCCAATTC 



i « 

•s .a 



TCTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAAT 
TCGGCACGAGGGCTTCACCGCCAACAGCACGGCCATGGCTGGAGCTCTGGTGCGCAAAGCAGCGG 
ACTATGTCCGGAGCAAGGACTTCCGGGACTATCTCATGAGTACGCACTTCTGGGGCCCAGTTGCC 
AACTGGGGTCTCCCCATTGCTGCTATCAATGACATGAAGAAATCTCCAGAGATTATCAGTGGGCG 
GATGACTTTCGCCCTCTGTTGCTATTCTCTGACATTCATGAGATTTGCCTACAAGGTACAACCCC 
GAAACTGGCTTCTGTTTGCGTGCCATGTGACAAACGAAGTCGCTCAGCTCATTCAGGGAGGACGA 
CTTATCAACTACGAGATGAGTAAGCGGCCATCTGCATAGCAGTGCAAGGACCAGCTCTTGAAAGG 
GACAGTGCTCCAGCCACTGTTGCGGCCACAGATCACGTCAGCATGAATAGTCGTGCTGAGGGGAA 
AACACGGAAGACTATCTTTAATGACCATGCCAACATTATTGAATAGCCAAGAATCCCCAAACCAA 
CTCTCGGCTGCCTTATCAATGCTAAACTTTATTTTGTCTTCATCAGGAGTAGTTCAAAATATGCA 

GCTAATTTTATNATT 



a* 

S 



.5 

U 



CAACTGTTGGAAAGGGGGATCGGGGNGGNCCTNTTCGTTATTACGCCAACTGGCGAAAGGGGGAT 
GTGTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCCAGTCACGACGTTGTAAAACGACGG 
CCAGTGCCAAGNTAAAATTAACCCTCCCTAAAGGGAATAAGCTTGCGGCCGCGAGATTTTTTO 
TTTTANTTTTTTTTTNTTTTTTTTTl^ 

ATATAAAAGGCAACAGTACATTGTTGATTGGGGGATTAAAGGAAGGAAGACCCTTTTAAAAAGTG 
GAGTTTCCCACCCCTATTCCTAAAACTGTAACATATATATCTATGTATATATTTATATATTTAAA 
AAAAACAAATTATGAAACTTACGACCAGGACTTTAGCCCCTCCCAATGTCTATCTTATAACAGAA 
ACATATTACATACACAGCCCCTAGTTGAGGATCACAGCTCCAGATGTGAGCTACGCCCCTCTTAT 
GCCTCCCCCGCTCCCTAAGCCTAGAAGTCCATTTCGAAAACTCAGTAGACACATACATACTTAGA 
AACGCACTCCTGGGTCCACCTCCACACTGGAGTCCCTGAAGCAAGTATACTGTGGCACTTTCCTG 
CCTCGTGCCGAATTCTTGGCCTCGAGGGCCAAATTCCCTATAGTGAGTCGTATTAAATTCGT 

CATGTCATA „ 


IcTCTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAAj 
fccGGCACGAGGCTGGTCCACTAACTCCAGGGACTGTCWGACAGAGGAACAAACAAG^ 
IcGAGGCCTCGCTCAGTGGCGCGCTTGCTGAGTGCTC 

I^^TCTCTTCCTGCCCGGTAGACCACCCGAGGTGGTGAGTGTGGTCTTGTCTTCCAGATTC 



2 Icgatttgcaggctcggtcgaccccaccatggacaagttcaactcatctatcgcctatcaccggca 
o tc?^aatc^acctgcagggaagcaaggcctacagcaggggcctggagaagg^gggc^ 
tcacc^ctcagatgc^gcagatactgcaagg^ 

GGCATC^AAArrGTACCCTAATGATGAAGACATCCACACGGCCAACGAGCGGCGCCTGAAGGA 



Iactcattggtgaagctgcagggaagttacacacaggcagaagtcgcaa 



Is 
a 



tctatgacatgattacgaatttaatacgactcactata^ 
tcg^cacgagottcacaaaga<^gcgggcctccccgctctgcagctctc 
Iaattctogtgataaatttgtaattgtagcttgttctcctaccacctgactggggc 

ccccTCAecTCCCCCccA^^ 
Itcgaagggaaggggggxstggcaggcagctgc^ 

c» CAraGTC^ATTTCTC 

* I^^OTATTCCCTTOAGTGAGGGTTAATTITAGCTTGGCACTGGCCGTCGTOT 

IcGTAATAGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAATC 

Iggacncgccctctagcggcgcattaagcgcggcgggtctggtggttacc^ 
Iacttgccaagcgccctancc 



(0 u 

o o 

O iJ 

U O, 

•D 01 

>i u 

A a 



5 



IcccSgcg^cta^ctcagHa^tcaccatggctc 

mCANGAAC^TAAACAACTGACAAAGTCTTCTAAATTAAAAGCGTTAGGG 
TGCAGGGMCAAAGTTCTCTGTGTAANGCATACTGTCCACTNTGA 

Icantcccctanaaaccggaaaaaaagctat 



a 



tatgacatg att acgaatttaatacgactcact atagggaatttggccctcgaggccaagaattc 

E_ySa5___«T^ 

_5K25ot_ttt^^ 

TAAAATCACTTCTTGTGTTGCTGAAGAAC 

GAAATGAACTGACTCGAGTGTTTCTAGTTACTCACTGGCTAAAGAATGGCGCTGAAGTTCACA 
ifiCMGGCTCG^GTCAAGCCA^ 

CGTATGAftGTGAGAAGGGCTCAM I 
PlTCCAGGAATGACTTTTT^ 

GCTCTGTTTACCTCATCGAGCCATCCTTCCTCAANGGGGGGGGGGGGGG 



I,- 



A^TCG A^MGTCTA^A^C A^StTGGmSaTAACTTGC TATTCTTGTTGC TTAGACATTTTCTGT 
« o - « - l i i^AGTGTTTGCCAAAATTGAAATTTTTCAGGTGTTTTAC 

A § 2 3 § h ITCGCTTTCTCTCTGGATCTTACAGGGATAGATAGAA I 
* 8 S ' ' iTCTCCTTT^ATATTTTTACTCTGTATGTATAACATACATACCTA^ 

ACTACTATC^ 



2s § 



! 

o 

X 



rTCAAGTTATGCATCAAGTTTGGTACCGAGTTGGATCCACTAGTAACGGCCGCAGTGTGCTGAAT 

Itcgcccttcgcggaattcggggcctttttgt^ 

GGCTCGGGAGACACCTCAGCTGACCTTGGAGCAGCCGCCCCAGGACGCATCCACCAAGAAGCTGA 

gItctggatacagactccccccgagaggtcttcttccgtgtggcagctgacatgt^ 
caac^aactggggccgggtggttgcccttttctactttgctagcaaactgg 
^cacta^agtgcccgagctgatcagaaccatcatgggctggacactggacttc 
cggctcct^tggatccaagaccagggtggctgggatggcctcctttcctacttcggg 
cacatggcagacagtgaccatctttgtggctggagtcctcactgcctcgctcaccatctggaaga 
Iagcttggccaagggcgaattctgcagatatccatcacactggcggccgctcgagcatgcatctag I 

AGGGCCGN ■ 


Igaactcttcagggatggggtgaactgi 



!TGGTGG AC AAC ATCGCTCTGTGG ATGACTG I 



ICATACTTGAGT 



GGGAATTTGGCCCTCGAGGCCTAGAATTCGGN 



-8 

?1 
is 



iGCTTCTGTGTAAATTATGTACTGCAAACAT 



|TTTTTAATTTCTGAATGGTCAGCCA r 
ATGAAGGCTTTGGTCTCCCTGGGAr 



TTTTTTAAATCTTCCGCCTTAATACTTCATTTTTG I 
^^CCTGCCCCITTTTTTTGTCCCCCC^ 

Icttctttgaggtgttgaggcagccagggctgkctctacac 

.GTGCACACCTTACAAACNAAAA AAAAAAAAAAAAAAAAA ^ 



lGTTCGCTCA* 



.GGTCCTTTTGGCCAGMOTTCAGACCGGACAACTTTG' 



TTTTTGGTCAGTCTGGGGC I 



GGAGGCACGGGCTCTGGCATGGGCACCC 
jcATCATGAACACCTTCAGCGI 



:tgctcatcagcaagattcgagaagaataccccgaccg I 



c 



OT^^TC^CGCrTNCTCGCTCACTTGACTCGCTGC 



agccggaagcataaagtgtaaagcc 

iTGCGCTCACTGCCCGCTTTCCAr 



CGCGCGGGGAGAGGCGGTTTGCGTA' 

TTCGGTCGTTCGGCTGCGG raAGCGGTATCAGCT, 



GGGGACGGGAGAATATCCTTTGAGGAG' 



lATGGTGGTAGA' 

lGAGCATGCATCTAGAGGGCCCAATTC 


Calnexin 


L18889 


-TATAGAATACTCAAGCTTATGCATCAAGCTTGGTACCGAGCTCGGATCCACTAGTACCGGCCGC 

: AGTGTGCTGGAATTCGCCCTTCGCGGGATCCCCAGATTTCTTTGATGACCTC^ 

^TAACTCCTTTCAGCGCTATTGGTTTGGAGCTCTGGTCCATGACATCCGACATCTr^ 

^TTTATCATTAGTGGTGACCGAAGAGTAGTTGATGACTGGGCCAATGATGGGTGGGGCCTGAAGA 

AAGCTGCTGATGGGGCTGCAGAGCCAGGTGTAGTGGGGCAGATGCTGGAGGCAGCTGAAGAGCGT 

CCATGGCTTTGGGTGGTCTACATTCTGACTGTAGCGTTGCCAGTGTTCCTTGTGATCGTCTTCTC 

CTGTTCTGGAAAGAAACAGTCCAATGCTATGGAGTACGAGAAGACAGATGCTCCCCAGCCAGATG 

rGAAGGACGAAGAAGGGAAGGAAGAAGAGTAGAACAAGGGAGATGAAGAGGAAGAAGAGGAGAAG 

CTTGAAGAGAAACAGAAGAGTGATGCTGAAGAAGATGGTGGCACTGGCAGTCAAGATGAGGAGAT 

AGCCTCGAGGGCAAGGGCGAATTCTGCAGATATCCATCACACTGGCGGCCGCTCGAGCATGCATC 

TAGAGGGCCNATTC 


Calpactin I heavy 
chain 


in 
<y\ 

i 


ACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTCGGCA 
CGAGGCTAGGGAGGCTCTCTGCAATAGGTGCCCGGCCCAGCTTTTTTTTC 

CACGAAATCCTGTGCAAGCTCAGCTTGGAGGGTGATCATTCTACACCCCCAAGTGCCTATGGGTC 
GGTCAAACCCTACACCAACTTCGACGCTGAGAGGGATGCTTTGAACATTGAAACAGCAATCAAGA 
CCAAAGGCGTGGACGAGGTCACCATTGTCAACATTCTGACTAACCGCAGCAATGCACAGAGGCAG 
GACATTGCCTTCGCCTACCAGAGGAGGACCAAAAAGGAACTGCCATCGGCGATGAAGTCGGCCTT 
GTCTGGTCACCTGGAGACCGTGATGTTAGGCCTGTTGAAGACACCTGCTCAGTACGATGCCTCTG 
AGCTCAAAGCCTCCATGAAGGGCCTGGGGACTGATGAGGACTCCCTCATCGAGATCATCTGCTCA 
AGAACCAACCAGGAGCTGCAGGAGATTAACCGAGTGTATAAGGAAATGTACAAGACCGATCTGGA 
GAAGTGGCCCTTACCTGTGCCCCAACCTAATGTTCTAGAGAATCAGCCTGCCACTAATGGGACCC 
CTGAACTCCTCCTGGGAANATGACGACAGANCTTGCCN 


Calreticulin 


D78308 


TGGGGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATT 

CGCCCTTCGCGGGATCCTGATGACCCCACAGATTCCAAGCCTGAGGACTGGGACAAGCCAGAGCA 

CATCCCTGACCCTGATGCTAAGAAGCCTGAGGACTGGGACGAAGAGATGGATGGAGAGTGGGAAC 

CACCAGTGATTCAAAATCCTGAATACAAGGGCGAATGGAAGCCACGTCAAATTGACAACCCAGAT 

TACAAGGGTACCTTCATACACCCAGAGATTGACAATCCTGAATACTCCCCCGATGCGAATATCTA 

TGCCTATGATAGTTTTGCTGTACTGGGCTTAGACCTCCGGCAGGTCAAGTCTGGCACAA 

ACAACTTCCTCATCACCAATGATGAGGCCTATGCAGAGGAGTTTGGCAATGAGACCTGGGGTGTC 

ACCAAGGCTGCAGAGAAGCAGATGAAGGACAAGCAGGATGAGGAGCAGAGGCTTAAGGAAGAAGA 

AGAAGACAAGAAGCGTAAAGAGGAAGAGGAGGCCGAGGATAAAGAGGATGAGGAAGCTTGGCCAA 

GGGCGAATTCCAGCACACTGGCGGCCGTTACTAGTGGATCCGAGCTCGGTACCAAACTTGATGCA 

TAGCTTGAGTATTCTATAGTGTCACCTAAATAGCTTGGCGTA 


Canalicular multispecific organic anion transporter 


D86086 


NTGNCNATGATTACGCCAAGCTATTTAGGTGACACTATAGAATACTCAAGCTATGCATCAAAGCT 
TGGTACCGAGCTCGGATCCACTAGTAACGCCCGCCAGTGTGCTGGAATTCGCCCTTTAGATGTTG 
CCTCCATTGGACTGCACGACCTTCGAGAGAGGCTGACCATCATTCCCCAGGACCCCATTTTGTTC 
TCGGGGAGTCTGAGGATGAATCTCGACCCTTTCAACAAATATTCAGATGAGGAGGTTTGGAGGGC 
CCTGGAGTTGGCTCACCTCAGATCCTTTGTGTCTGGCCTACAGCTTGGGTTGTTATCCGAAGTGA 
CAGAGGGTGGTGACAACCTGAGCATAGGGCAGAGGCAGCTCCTATGCCTGGGCAGGGCTGTGCTT 
CGAAAATCCAAAATCCTGGTCCTGGATGAAGCCACGGCTGCAGTGGATCTCGAGACGGATAGCCT 
CATTCAGACGACCATCCGAAAGGAGTTCTCCCAGTGCACGGTCATCACCATCGCTCACAGGCTGC 
ACACCATCATGGACAGTGACAAGATAATGGTCCTAGACAACGGGAAGATTGTCGAGTATGGCAGT 
CCTGAAGAACTGCTGTCCAACAGAGGTTCCTTCTAAAGGGCGAATTCTGCAGATATCCATCACAC 


Carbamyl phosphate synthetase I 


M12335 


CTCAACGCCAACAATGTTCCTGCCACCCCAGTGGCTTGCCATCTCAGGAAGGACAGAATCCCAGC 
CTCTCTTCCATCAGAAAGTTGATAAGAGACGGAAGCATTGACCTAGTGATTAACCTCCCCAATAA 
CAACACCAAATTTGTCCATGATAATTATGTGATTCGGAGGACAGCTGTGGACAGTGGAATTGCTC 
TGCTCACCAATTTCCAGGTGACCAAACTTTTTGCTGAGGCTGTGCAG 

TCCAAGAGTCTGTTCCACTACAGGGAGTACAGCGCTGGAAAGGCAGCATAGAGCAATATGCTGGT 
CTAGTGAATTCATCCTTCAGTCAGCAGGAGCCACACTGTACCAAAGGGACTGGTCCCAGCCTATC 
» ,mpr aav^TTY^prpnr'r a a arr T^aTTTTTGGTTTCCCTGTTCTTAGTAGTCTGTATCTT 
TAACCACTGAATTGTTGTCAGTCACTGTTTCAAAACCATCANGGTCTTCCCAAGTCTCTTGTC^ 
CACAAGGAATCCAACCATTCATACATTATCTTTTGTANACCGACTTGTACATTTCATGGG 


H 

H <N 
U H 
•H 0) 
C 0) U 
A (0 V 
M V4 5 

fl) -p & 

10 


a\ 

O) 

CO 

o 


CTAACCCAGAAGCATGAATTTCACACCTAACCTTTTTAATAACTACCTTT^ 

ATTTCTAGTATAAATTCAGTGAAGAAGAAATAGATGGAAAAATAACGACAGTGACAGTGGCTTAT 

TAAGCATATTAAGTTTCTCAGAATCTCAGCACTCCTGTTTTCAGGCTGAGTTACTGACAAGTGAT 

GGGTCCATGCTATGTATGTGGGTATGGAGGCATGTGCCCACCTTACCACATTTGATTGAAAGCAC 

AAGTTAAGATCACTGTAGATTTCAGAAGGTGAATACATAATGTTTACCTCAAATAATACCATCGC 

TACTCTATCACTTTTTAAAAATTTGCCTACTACCAACTTCGCCTGGTTTTAAATTAC^ 

ACACACCCATAAGGCCAGGTCTTTTAAATTTTTTGATCCAGC^ 

ATGGTTTTCTTTGCCCAATCTAGGGTCATTTATTTTTTATTTC 

TATTAAAAACAAAGCAAAACAAACAAACAAAAAAAAAaACCAAAAAAAAAAAAA 









TGCGAATTGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCG 
CCCTTTGACCCCACCCCCTTCCACATTCAAGCAGAGGTGACAATGAAAA 

CCAAGATGTCTGCAAGGAGCTACTCCCTATAATAAAACCCCAAGGCAGAGTGGTGAATGTATCAA 
GCAGCGTGAGTCTCAGGGCCCTGAAAAGCTGCAGCCCGGAGCTGCAGCAGAAGTTTCGAAGTGAG 
ACCATCACTGAGGAAGAGCTGGTGGGGCTCATGAACAAGTTTGTAGAGGATGCAAAGAAAGGAGT 
CCATGCGAAAGAAGGCTGGCCCAATAGTGCATATGGGGTCACCAAGATAGGGGTGACAGTCCTGT 
CCAGAATCTATGCCAGGAAACTCACTGAGGAGAGGAGAGAGGACAAGATCCTCCTGAATGCCTGC 
TGCCCTGGGTGGGTCAGAACCGACATGGCAGGACCAAAAGCCACCAAAAGCCCAGAAGAAGGAGC 
AGAGACCCCCGTGTACTTCGCCCTTTTGCCTCCAGGTCCAGAGAAGGGCGAATTCCAGCACACTG 
GCGGCCGTTACTAGTGGATCCGAGCTCGGTACCAAGCTTGATGCATAGCTTGAGTATTCTATAGT 
GTCACCTAAATAGCTTGGCGTAATCATGGGCATAGCTGGTTNCTGTGTGAAAT 



* 



TTATCACATGATTACGAATTTGAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAA^ 
TCGGCACGAGGCCAGCTTCTCTCACCCTACTCTTGGGTTCAAGATCTTAGCAACCATGAAACTTC 
TTATCCTCACCTGCCTCGTGGm^CTGCTCTTGCTCTGCCTAGAGCTCATCGTAGAAATGCAGTC 
AGCAGTCAAACTCAGCAAGAGAATAGCAGCAGTGAGGAACAGGAAATTGTTAAACAACCAAAGTA 
TCTCAGTCTTAATGAGGAGTTCGTCAACAACCTGAACAGACAGAGAGAGCTTCTGACAGAACAGG 
ATAATGAAATCAAGATAACTATGGACTCATCAGCTGAGGAACAAGCAATGGCAAGTGCTCAGGAA 
GATTCCTCCTCAAGCAGCTCATCAAGCGAGGAATCCAAGGATGCTATTCCCAGTGCTACTGAGCA 
GAAAAACATTGCAAACAAAGAAATACTCAACCGATGCACCCTGGAACAGCTTCAGAGACAGATTA
AATACAGCCAACTTCTCCAGCAAGCTTCACTGGCTCAGCAAGCTTCCCTGGCTCAGCAAGCTTCT 
TGGCCAGCAAGCTCTCCTGGCCCAGCAACCTTCCTGGCACAGCAAGCTGCCCTGGCACAGCAAGC 
TTCCCTGGCACAGCAAGCTTCCCTGGNACAGCAAGCTTCCCTGGCACNAGAAACATCATCCAAGN 

AC 



(0 
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GCCTCTTGACCGTGATGTTGATGAAGTAAGTCTTCTGTACCTCCTTAGGATGGCATCATC 
CCTTGTGCTTACTGCAGGTTGTAATGGCACGTTTCACTTGCTCCCTC 

GAGAGCAGGACTGTACTCCTTACTTATTGGCTATTCAGTACGGCACTTACTAGGTCTTCAATGAA 
TGTTTCCTGAGTAAAGGAAGGAGACAAGCAGCTAACTTTAGTTAGAGCCTATCTTTTGCG 

TAAATTGCTATTATAGTACTCAGTTTAATTAG^ 
GTTTTATTTTTATCCACTGTTTTGTTGTTTTTTTACATTGT^ 

AACTTCTTTCACATCTCCATGGTGCCCCGCAAATTTGAGGCCTATGGTAGTTGAGGTGCTC 
AATGTTTGTCGTATGAACCAGGTGGTTTGAAGACTTGCTGCCAAATTC 






TTGGCGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAAT
TCGCCCTTCGCGGAATTCGATGCAGGATCTGCTTAGACGAGCCTCTGAAGAGGACCACAGCAACT 
CAGCCTGCTTCGCCTGCGTCCTGCTGAGCCACGGAGAAGAGAATCTGATTTATGGGAAAGACGGC 
GTGACACCGATAAAGGATCTGACAGCTCATTTTAGGGGAGACCGATGCAAAACCCTGCTAGAGAA 
ACCCAAGCTCTTCTTCATCCAGGCGTGCCGAGGGACGGAGCTGGATGACGGGATCCAGGCTGACT 
CGGGGCCTATCAACGACACCGACGCTAATCCCCGCTACAAGATCCCGGTGGAAGCTGACTTTCTC 
TTTGCTTACTCCACGGTTCCAGGCTATTATTCGTGGAGGAACCCAGGAAAGGACTCCTGGTTTC 
GCAGGCCCTCTGCTCCATCCTGAATGAGCATGGCAAGGACCTGGAGATCATGCAGATCCTGACCA 
GGGTGAACGACAGGGTGGCCAGACACTTCGAGTCCCAGTCTGATGACCCCCGCTTCAATGAGAAG 
AAAAGCTTGGCCAAGGGCGAATTCCAGCACACTGCGGGCGNTACTAGTGGATCCGAGCTCGGTAC 
CAACTTGATGCATAACTTGAGTATTCTATAGTGNCACCT 

GAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCGCC 
CTTATCGCGGGATCCGATCTAGGCAATCAGGGCTGTAATGGAGGCCTGATGGATTTTC 
GTACATTAAGGAAAATGGAGGTCTGGACTCAGAGGAGTCTTATCCCTATGAAGCAAAGGATGGAT 
CTTGTAAATACAGAGCTGAGTATGCTGTGGCTAACGACACAGGGTTTGTGGATATCCCTCAGCA^ 
GAGAAAGCCCTCATGAAGGCTGTAGCGACGGTGAGGCCTATTTCTGTTGCCATGGATGCAAGCCA 
TCCGTCTCTCCAGTTCTATAGTTCAGGTATCTACTATGAACCCAACTGTAGCAGCAAGGACCTCG 
ACCATGGGGTTCTGGTGGTTGGCTATGGTTATGAAGGAACAGATTCAAATAAGGATAAATACTOT 
CTTGTCAAAAACAGCTGGGGTAAAGAATGGGGTATGGATGGCTACATCAAAATAGCCAAAGACCG 
GAACAACCACTGCGGACTTGCCACCGCAGCCAGCTATCCTATCGTGAATTGATGGACAGCGATAA 
AGCTTGGCCAAAAGGGCGAATTCCAGCACACTGGCGGNCGTTACTAGTGGATCCGAGCTCGGTCC 
AAGCTTGATGCATAGCTTGAGTATTCTATAGTG 
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GACCTCGACCATGGGGTTCTGGTGGTTGGCTATGGTTATGAAGGAACAGATTCAAATAAGGATAA 
ATACTGGCTTGTCAAAAACAGCTGGGGTAAAGAATGGGGTATGGATGGCTACATCAAAATAGCCA 
AAGACCGGAACAACCACTGCGGACTTGCCACCGCAGCCAGCTATCCTATCGTGAATTGATGGACA 
GCGATAATAAGGACTTACGGACACTACATCCGAAGGAGTTCATCTTAAAACTGACCAAACCCGTC 
TCTGAGTGAGACCATGGTACTTGAATCGTTCAGGATCCAAGTCACGATTTAT^TTCTGTTGACA^ 
TTTTACATGGGTTAAATGTTACCACTACTTAAAACTCCTGTTATAAACAGCTTTATAATATTGGA 
CACTTAATGCTTAATTCTGATTCTGGAATATTTGTTTTATAAAAGTTC 
CTTTTAAAAATAAATTTTAAGCTCAGTGCAAAAAAAAAAAAAAAAAAAAAAAAAA 
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Cathepsin S


L03201


3AATGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCGCCC 
ITGGTACATTGAGCTCCCCTTCGGCGATGAAGAAGCTCTGAAAGAAGCAGTGGCCACTAAAGGGC 
CTGTCTCTGTGTGCATCGACGCCAGCCATTCCTCCTTCTTCCTCTACCAAAGTGGTGTCTATGAT 
3ACCCCTCCTGTACCGAGAATGTGAATCATGGTGTTCTCGTGGTTGGCTATGGGACTCTTGATGG 
3AAAGACTACTGGCTTGTGAAAAACAGTTGGGGCCTTCACTTTGGTGATCAAGGATATATTCGGA 
TGGCGAGAAATAACAAAAATCACTGCGGGATTGCTAGCTATTGCTCTTACCCAGAAATCTAAACC
GTTTCTTCTTTTTCTAATC^ 

CCGGAGGACCCAAGTGTGTCGTGATCAGTGTGTACATACTGTGCTAACTGGCTTACAGCTTGTTT 
GTTTTATAACCTTACCTCTCTCTGAAAAGTCTGTAAGCAAGGACGCGCTGAGGAAGGGCGAATTC 
CAGCACACTGGCGGGCGTTACTAGTGGATCCGAGCTCGGACCAAGCTTGATGCATAGCTTGAGTA 
TTCTATAGTGTCCCTAAATAGCTTGGCGTAATCATGGNCATAGCTGGTTCCTGTGTGAAAT 


CCR-5 


U77350


GACCGAGCTCGGGATCCACCTAGAACCGNCCCGGCCAGTGGCTGGAATTCGCCCTTCGGTCCACA 
GGAGACCAGGAAGTTTCTACTGGTTTATGAACTAGGTTGAGTTTTGTC 

TGTAGCTTGAGGGTAGAGATGGTTCTTTTAGAAAAGAAATTAATATAGAGGGCCTAAGGTACGTG 
CATTGTTTTAATATTTATTTGAGATAGATTGGGTCTTTGAAAGCTC 

ATGAGAATGTATTGATGAGACAGTAATTTCTGGCTTCATTCCTATTTAACTACAATTATTCCCAA

ACTCTTCTTTAGCCACAAAAGTTCACTATTAAAAAATACTGAGCATTGGGATTTTTTTTTAGA

TCGAGTATCTGACCAAAAATAACATCATTTTTTCCCTATATAAAGCAAAATTCAGGCTACTTAAA 

TCAGGTCTTTGTCTTGCCCTGAGAAGAAAGGATGAGACCATGACTAGTTAGGGAATAGATAACCA 

AGCTATATGAATTGCTTTATAGACTTGTATGATCAGTAAGTGCTCTGTGGCCTGAGAAGGAAGGG 

CGAATTCTGCAGATATCCATCACACTGGCGGCCGCTCGAGCATGCATCTAGAGGGCCCAATTCGC 


CD44 metastasis 
suppressor gene 


M61875


TGNGAATTGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCG 
CCCTTATCGCGGGATCCGGTGCATTTGGTGAACAAGGAACCAACAGAGACTCCGGACCAGTTTAT 
GACAGCTGATGAGACCCGGAATCTGCAGAGTGTGGATATGAAGATTGGGGTGTAGTGCCTATGCC 
ACTAACTTGAAAAGACACAACAATTGGAGACATGTCATTACTGGGAGCTGGGACCCTTAACAGAT 

GCAATGTGCTACTGATTATTTTTATTC^ 
TTTTAAAAGTTTGTTTTCCAATTTATGAAAATAGCATTGC 

TTCCTCCTTAGAGGCCTTGCATTACCAGGGTATGCTACCATAGGCTTCTACCAAATGAATACTCT 
TGGTCCCGATTGAACCCAAAGTCCCAGGTAACATCCACCAGCTAAGGATTTCCCCAGAACTTAGA 

GAGATTGGTCTCTGGGAGGAAATTTGAATGGGTCCATATTGCCTCCCAGCAG 
CTTGGCCAAAGGGCGAATTCGACACACTGGCGGGCGTACTAGTGGATCCGAGCTCGGACCAAACT 


CDK102 




TATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCy^GAATTC 

GGCACGAGGCCCCAACTTCCTAATGGATGCAGATAAACACAGTGTACAGATTTTTGATAATCACC 

GTAATAAGTATTTAGACTAGGGGTTAAGCTCTGGTTTTTAGAATTCTTAAGATCATCTGTGGTGA 

AAAGTAGTTATGAAGCTGGGCTTGGTGACAGACCACCATGGGATCCCAGCACTCAACAGGCAGAT 

GTTAGTTCAAGGTCGGTCTACATAGCAAGTTCCAGACCAGCCACAGTTACAAGGTGGGATGCTAC 

CTACTGGTTATGGGAATTTCATAATTCAGGATAACAGTGGTGTATCTTACGTACTTTTC 

GCCCATATCCACCACTGGATTTTGGTCCGAGACTTCGTGTCTTCTCCTTGGAAATC 

TGGGGAGGTGTTGTGCAGTGTCAGATGCAGGAGATTCCCGTCTTTGTCTC 

CCCACGGTGAACACTGGCTTTGCAACACTGAGAAATAAATAAAACACTCATGTCCTTGGTTAAAA

AAAAAAAAAAAAAAAAAAACACTGCGGCCGCAAGCTTOATTCCCTTTAANG 

CTTWCACTGGCCGTCGTTTTACAACCGCCTNG^ 

AN . 


CDK108 


Y17328


TCTTNACATGATTACGAATTTAATACGACTCACTATAGG^AATTTGGCCCTCGAGGCCAAGAATT 
CGGCACGAGGCAGTGCTGTATGTGGACTCCCGGGAGGCTGCCCTAAAGGAGTCAGGAGATGTTCT 
GTTGTCAGGGGCTGACATCTTTGCTGAGCTTGGAGAAGTGGTTTCAGGM 

GTGAGAAGACCACGGTGTTCAAGTCTTTGGGGATGGCAGTGGAGGACCTGGTCGCAGCCAAATTA 

GTGTACGATTCGTGGTCATCTGGCAAGTGAGCAGAAGGAGCTGTGCCTGTGCTGGATGGACGTCA 

CGGCTCAAACGCTGGCTCAGTGTCTAGATCAAAGGAGGCCTAGTCCCCAGTGAACGGGAGTGAGA 

GTCACTCATAAGTATTGACATCCCTATTCATGTTTGTGGTTGGATAGCTAAACCCTTCTGTTAGG 

GGGTGATGGCCACATTACCTACCCTTGATCTTACTAGCCTTGTGTGTCTCTGAAATAAATCATTT 

CCAGTTAAAAAAAAAACCTTTGCGGCCGCAAGCTTATTCCCTTTAGTGAGGGTTAATTT^ 

GGCACTGKXTCGTCCGTTTTACAACGTCGTGACTGGGAAAACCCTGGGCGTTACCCNACTTATCGC 
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GCTATGCCATGATTACGCCAAGCTATTTAGGTGNCCTATAGAATACTCAAGCTATGCATCAAGCT 
TGGTCCGAGCTCGGATCCACTAGTAACGGCCGCCAGTGTGCTGGAATTCGCCCTTATCGCGGATC 
CCCCAGGCCAGTGAGCTTTACTTGCAGTGTAAAAGGAGGAAAGGGGTGGAAAAAAAACCGAATTT 
CTGCATTTAACTACAAAAAAAAGTTTATGTTTAGTTTGGTAGAGGTG 

TAAAGAACCCCCTTTCCGTGCCACTGGTGAATAGGGATTAATGAATGGGAAGAGTTCAGTCAGAC

CAGTAAGCCCTTCTGGGTTTGAGTGTGTTCCCATGTAGGAGGTAAAACCAATTCTGGATGCATCT 

AAGCTTCCATGAATAACTTTAATTCTTAGCATAATGATGGCCTTGGATTGTCTC 

TATTAAATAACATCGAGTAACATCTGCATCAGGCCCTCATAGAACATTCAGTTGAGTTGGGAGTA 

AACTGAAAAGACAAATGTGTTGAAGGATATGCCAGGGAATCTGGCTAAAGCCTAATACAGGAGCA 

TCTTCATCCCAGTGTTGATGCTGGACGTACCTCAAGGGCGAATTCTGCAGATATCCATCACACTG 

GCGGCCGCTCGAGCATGCATCTAGAGGGCCCAATTCGC 
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GANAACCGTTCGTATAAANCCNTGCTCGANTCGGCCGGCAGNGTGATGGATCTNN^^ 
CCCTTATNGNGGGANCCCAACCGAGGATCNAGTCAACNGATCTNTATNTTGGGCTAGGGAGTC 
ATTGATTGATCNTTGGAAATCTNANGTGNAAGTTTTCANTCCTAAA 
TTCTGTTNCTANNACCTTGNTTACAANGNTTCCTTC 

CATTTGGANCNCNGCTGGNNNANAGNTTTCCTCTCACAACNATGGNACNCATANANNGGNGANTA 
AACCGCACC 



TCCAGGGAAAGGATATAGGCAAAATTCGCNCTTATCGCGGGATCCGNGTTTCTACGCNNATAAAC 
GAGGGGTCATTCCTTCTNTTCCAAGTAGNGCNTCNCTGNCGGGNGGCAANCNTTTC 
ATTCCCCAAGATGAOmrCTTTCTCNAGCATGGGNTTCNCCNGTCTC 
GCAGATCTGCNGGTNTCTAGTGCCATTTI^ATCNNATTGGTGCANCT 

TGCAGTGGCTGGNGCNGCNCACTCTGGTCTCNNCNGTGGCCCCATCGCAGACNAGAGCGCCCCAT 
CCTTCGGNCNTCCCCACCCCGTCGCCGGGKGCTTAGNCCATANNGNNAGTGGTGAANNCNATGTC 
AGGCAGCAGAGCNCANAGCATCGGCAGAAGGGGCAAAGTAGAGCAGCTATNTCCTGAAGAGGAAG 
ANNNTCGCAGNATCCNNAGGGAAAGNAATAAGATGGCTGCAGTCTAGTGCCGAAATCGGAGNAGG
ANANTGACAGATACGCTCCAANNGGAGNCNGATCNNCTimOTAGTAGAAGTGTGNGTTGCAGAC 
CGNGATTGCCNATCTNCAGAAAGAGAAGGAAAATTCAGTTTTfTOJNCAGCCTCNACCTCTO 
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GAAATTGNGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGT^TTCGC 

CCTTCATAGCTGGGGCCAGAGCTTC^TCACTTTCAGAAAGCAATGTCCTTTGTATW 

AATGAAGATATTCCAATTGGCAGGATATTTTTCCTAAGGAAATTGCTTTATATTTTTATGAAAAC

TACCAATTAATTATGAAAGGGCTTGAAATTCACGTTTTAGTGAAAT^^ 

AGGTTCTTCAGGTGTGAAACTGTATTATAAAAATGTTGTAATGGGTCACACTGTG 

AGGTAAAGGAAACTATGTTTCAGCCTTTTCTGTGTCTATGAGCTTC 

TAGAAACACTGGGGAGGCTTCGACATGCTCTCGCTATATTTTATTTTACTO 

CATTTCAGTTTTCAACTACCTTATCTTTCCC^ 

TTTTAGGAATTAACAAGGCACCTCCCAGAACCCTACCCTGAGACTAAGGGCGAATTCCAGCACAC 
TGGCGGCCGTACTAGTGGATCCGAGCTCGGTACCAAGCTTGATGCATAGCTTGAGTATTCTATAG 
TGTCCCTAAATAGCTTGGCGTAATCATGGGCATAGCTGNTTCCTGTGTGAAATTGNTATCCGCTC 
AC 
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AATACTCAAGCTATGCATCAAGCTTGGTCCGAGTTCGGATCCACTAGAACGGCCGCCAGTGTGTN 
GAATTCGCCCCTTCGCGGGATCTTGGAAGTCCTACCCAACACTGAAAATCTCTGAGAAAATGATT 
CCAGTGGTTGCTGAGAAGTACTTCGGAGGGACAGATGACCCTGCCAAAAGGAAAGACCTGTTCCA 
GGACTTGGTTGCAGATGTGATATTTGGTGTCCCATCAGTGATGGTGTCTCGAAGCCACAGAGATG
CTGGAGCCCCCACCTTCATGTATGAATTTGAGTATCGCCCAAGCTTTGTATCAGCCATC 
AAGACAGTGATCGGAGACCATGGTGATGAACTCITCTCAGTATTTGGATCTC 
TGGTGCCTCAGAAGAGGAGACCAATCTCAGCAAAATGGTGATGAAATACTGGGCCAACTTTGCTC 
GGAATGGGAGCCCTAATGGGGGAGGGCTGCCCCATTGGCCAGAATATGACCAGAAGGAAGGGTAC 
CTGAAGATTGGTGCCTCAACTCAGGCAGCCCAGAGGCTGAAGGACAAAGAAGTGGCTT 
TGAGCTCACTGCAGGGCCAAGGGCGAATTCTGCAGATATCCATCACACTGGCGGCCGCTCGAGCA 
TGCATCTAGAGGGCCCAATTCT 
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CCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCGCCCTTATCCAG 
CTGATCCAGAACCATTTTGTGGACGAGTATGATCCCACTATAGAGGACTCCTACCGGAAACAGGT 
AGTCATTGATGGGGAGACGTGTTTACTGGACATCTTAGACACAGCAGGTCAAGAAGAGTATAGTG 
CCATGCGGGACCAGTACATGCGCACAGGGGAGGGCTTCCTCTGTGTATTTGCCATCAACAACACC 
AAGTCCTTTGAAGACATCCATCAGTACAGGGAGCAGATCAAGCGGGTGAAAGATTCAGATGATGT 
GCCAATGGTCCTGGTGGGCAACAAGTGTGACCTGGCCGCTCGCACTGTTGAGTCTCGGCAGGCCC 
AGGACCTTGCCCGCAGCTATGGCATCCCCTACATTGAAACATCAGCCAAGACCCGGCAGGGTGTG 
GAGGATGCCTTCTACACACTAGTACGTGAGATTCGGCAGCATAAACTGCGGAAACTGAACCCGCC 
TGATGAGAGTGGCCCTGGCTGCATGAGCTGCAAGTGTGTGCTGTAAGGGCGAATTCCAGCACACT 
GGCGGCCGTTACTAGTGGATCCGAGCTCGGTACCAAGCTTGATGCATAGCTTGAGTATTCTATAG 
TGTCACCTAAATAGCTTGGCGT 



GCGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCG 
CCCTTATCGCGGGATCCTCGTTCCTCCAGTCCGAGAGTGGCGCCTACGGCTACAGTAACCCTAAG 
ATTCTGAAGCAGAGCATGACCTTGACCCTGGCCGACCCGGTGGGCAATCTGAAGCCGCACCTCCG 
AGCCAAGAACTCGGACCTTCTCACGTCGCCCGACGTCGGGCTGCTCAAGCTGGCGTCGCCGGAGC 
TGGAGCGCCTGATCATCCAGTCCAGCAATGGGCACATCACCACTACACCGACCCCCACTCAGTTC 
TTGTGCCCCAAGAACGTGACCGACGAGCAGGAGGGCTTCGCCGAAGGCTTCGTGCGCGCCCTAGC 
TGAACTGCATAGCCAGAATACGCTGCCCAGTGTCACCTCCGCGGCACAACCTGTCAGTGGGGCGG 
GCATGGTCGCTCCCGCTGTGGCCTCAGTAGCTGGCGCTGGCGGCGGCGGCGGCTACAGCGCCAGC 
CTGCACAGTGAGCCTCCGGTCTACGCCAACCTCAGCAAAGCTTGGCCAAGGGCGAATTCCAGCAC 
ACTGGCGGCCGTTACTAGTGGATCCGAGCTCGGTACCAAACTTGATGCATAGCTTGAGTATTCTA 
TAGTGTCACCTAA 



GAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCGCC 
CTTCGCGGGATCCACCAGGCTCAACAGGCCATGGACGTCCAGCTCCATAGCCCAGCTTTACAGTT 
CCCGGATGTGGATTTCTTAAAAGAAGGTGAAGATGACCGCACAGTGTGCAAGGAGATCCGCCATA 
ACTCCACAGGATGCCTGAAGATGAAGGGCCAGTGTGAGAAGTGCCAAGAGATCTTGTCTGTGGAC 
TGTTCGACCAACAATCCTGCCCAGGCTAACCTGCGCCAGGAGCTAAACGACTCGCTCCAGGTGGC 
TGAGAGGCTGACCCAGCAGTACAACGAGCTGCTTCATTCCCTCCAGTCCAAGATGCTCAACACCT 
CATCCCTGCTGGAACAGCTGAACGACCAGTTCAGCTGGGTGTCCCAGCTGGCTAACCTCACACAG 
GGCGATGACCAGTACCTTCGGGTCTCCACAGTGACAACCCATTCTTCTGACTCAGAAGTCCCCTC 
TCGTGTCACTGAGGTGGTGGTGAAGCTGTTTGACTCTGACCCATCACAGTGGTGAAGCTTTGGCC 
AAGGGCGAATTCCAGCACACTGGCGGCCGGTACTAGTGGATCCGACTCGGTACCAAGCTTGATGC 
ATAGCTTGAGTATTCTATATGTC 



NTNCNNANGNGCCCTNTANATGCTGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCG 
CCCTTCGCGGGATCCTGGAGAAGAGGCAAACCCCTGCCAAGAGGTCCGAGTCAGGGTCATCCCCA 
TCAAGAGGCCACAGCAAACCTCCACACAGCCCACTGGTCCTCAAGAGGTGCCATGTCTCTACTCA 
CCAGCACAATTATGCAGCACCCCCCTCCACAAGGAAGGACTATCCGGCTGCCAGGAGGGCCAAGT 
TGGACAGTGGCAGGGTCCTGAAACAGATCAGCAACAACCGCAAATGCTCCAGCCCCAGGTCCTCA 
GACACCGAGGAAAACGACAAGAGGCGGACACACAACGTCTTGGAACGTCAGAGGAGAAACGAGCT 
GAAGCGTAGCTTTTTTGCCCTGCGCGACCAGATCCCnXSAGTTGGAAAACAACGAAAAGGCCCCCA 
AGGTAGTTATCCTCAAAAAAGCCACCGCCTACATCCTGTCCGTTCAAGCAGATGAGCACAAACTC 
ATCTCAGAAAAGGACTTACTGAGGAAACGACGAGAACAGTTGAAACACAAAAGCTTGGCCAAGGG 
CGAATTCCAGC^CACTGGCGGGCCGGTACTAGTGGATCCGAGCTCGGTCCAAACTTGATGCATTA 
GCTTGAGTATTCTATAGTGTCACCTAAATAGCTTGGGGTA 
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TAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCGCCCTTATCGCGGAAT 
TCGTCTGTGGTGACCTGGGCATGGGTGACAGGGCTCTTACTTGCTC 

TGCCCCGCCTCCCCCCTTTCCTGCCCTCCCTGGCTACTGGGTCGCTAATCTTTCAGGCCATGGAT 
CCGGAGGAGAGTGGTCTATAGGCTCCACCAGCCCTGTCCTGAGACAACAGAGGGGGTGAGGACAC 
TGGAGACTTTCCCGTGGGGCTTACTTAGCCTTCTAGTTACAGACTATTTCCACACTAGAAAATAC 
GTATTTTTAAATAGAAGAAAAACACAGAAACAAACAAAAGGCATTCTCCTACCCCTCCATCTTAA 
ACATACATTATTAAAGACAGAAGAACAAATCCAACCCATTGCAAGAGGCTCTTTGTGGGTGCCTG 
GTTGCATAAGAACAGGAGGAGCCCCAAACCCACCTTTGGAGCTTCCCTGCACAGGAACCCCTTCT 
TCCCTCCAAGAAAGCTCAGAGGGAGCACTGCCAAG^TTGGCCAAGGGCGAATTCCAGCACACTGG 
CGGCCGTTACTAGTGGATCCGAGCTCGGTACCAAACTTGATGCATACTTGAGTATTCTATAGTGN 
CACCTAAATAGCTTGGCGTAATCATGGCAT
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TTGGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTC 
GCCCTTCGCGGGATCCAGGATCTGATGCGCCAGTTTCTAAGCGGCCTAGATTTCCTTCATGCAAA 
CTGCATTGTTCACCGGGACCTGAAGCCAGAGAACATTCTAGTGACAAGTAATGGGACAGTTAAGC 
TGGCCGACTTTGACCTAGCCAGAATCTACAGCTACCAGATGGCCCTCACGCCTGTGGTTGTTACG 
CTCTGGTACCGGGCTCCTGAAGTTCTTCTGCAGTCTACATATGCAACGCCTGTGGATATGTGGAG 
TGTTGGCTGTATCTTCGCAGAGATGTTTCGCCGGAAGCCTCTCTTCTGTGGGAACTCTGAGGCTG
ACCAGCTGGGCAAAATCTTTGATCTCATTGGATTGCCTCCAGAAGACGACTGGCCTCGAGAGGTC 

TCTCTTCCTCGAGGAGCCTTTTCCCCCAGAGGAC^ 

GGAGGAATCTGGAGCGCAGTTGCTGCTGGAAATGCTGACCTTTAATCCACTTAAGCGAAGCTTGG 
GCAAGGGCGAATTCCAACACACTGGCGGGCGGTACTAATGGATCCGAGCTCGNACCAACTTGATG 
CATAGCTTGAGTN 
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TGNGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTC 
GCCCTTCGCGGGATCCTGGAACTGATGATGATGAAGGCCCTTAAGTGGCGTTCAAGCCCCC 
CATTGTGTCCTGGCTGAATGTTTATGTCCAAGTGGCCTACGTCAACGACACGGGAGAAGTGCTGA 
TGCCTCAGTACCCACAGCAGGTCTTCGTGCAAATCGCAGAGCTTTTAGACCTGTGCGTCCTGGAT 
GTTGGCTGCTTAGAATTTCCTTATGGTGTCCTCGCTGCCTCTGCTTTGTATC 
GGAGTTGATGCAGAAGGTCTCAGGTTATCAGTGGTGTGATATAGAGAAGTGTGTCAAGTGGATGG 
TTCCATTCGCCATGGTTATCCGGGAGATGGGAAGTTCCAAGCTCAAGCACTTCCGGGGAGTTCCC 
ATC^AAGACTCCCACAACATCCAGGCCCACACCAACAGCTTGGACTTGCTGGACAAAGCCCAAGC 
AAAGAAAGCCATATTGTCAGAACAAAATAGGATTTCTCCTCCTTCGAGTGGGAAACTTGGCCAAG 
GGCGAATTNCANCACACTGGCGGGCGGTACTAATGGATNCCAGCTCGGACCAAACTTGATGCATA 

ACTTGAGTATTCTAT 
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GNOTTCQ^GGGGATGGATATNTGCAGAATTCGCCNNNATC 

TNACNGC AAG ACTCC C AGGTCTTNTNAAAGNC AGAG ATTTCCGG AGNCCTAAC TC AGTNATGGGC 

TTNGACACANAAACATTTTCCCTTTGCTGTGAATTTTACTGCAGAGATTGG 

GACAGGNGNAGCATCTCGGATGTGTCGGCCTTGAGNNGT^^ 

AGAGGNAAGGNANTGTCCCNGCTGGCNNACTTNNTNTGNTCCGNATNAGNOT 

GTTTCAGACGTGATNAGAATGCAAAAAGATTGTGTTNNANANAGTGTGNTGCA 

TTACNNCCTTCCAOTTTCTNCAGNTCTATCANTCCCTCATTNGGG 

AGNNACNNTl^G AATTNTG ANAGNGT ACNAGC C CTNNTNAAGGNGTGC C ACNG CNGG ATC AT ATT 
TTCTAAGNCAAAGCCNTCTGNGATGGNNCTNGCGATCATTN^ 

ATGTGNAGNNCACNGAAGGAGNANNATGTNTTCAAGCNCATTCCCAGATAAGTGGCCNGTATNTN 
NCCTTCTGGCAAGAGCCNGNOTCCTAGTGTCTTCCGAGATCTTC 
NNNNANNNNNNC 
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GCGAATTGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCGC 
CCTTCGCGGGATCCGTTGCTGGGGGAAGGAATGTTCCAATCGCTGTACAAGCAGTGGCAAAGGCC 
TCCATTGACCAGAGCAGAGAGATGAAATACCAGTCACTCAATGAGTACCGCAAACGCTTCTCCCT 
GAAACCTTACACATCGTTTGAAGAACTTACAGGTGAGAAACGGTTGCTGAAATTTCTAAAATGAC 
CAAGGATTAAATGAGAAAAGAGAAGGTGAGAGGGAATTTAGTGAAGGAATAAACTGTCTTCCTCC 
TCTTCCTCTTTTTCGTCTTCTTCTGGAAACAGAAAAGGATTGAGTITA 

GAATGGGGGTCGGCTAGTGAATGTGTACTCGTCATCC^GGGCTATTGTCGGCAATTCCCAGTTTG 
ATATAGCCATATTGTCTGTACCTTCCAGGAGAGAAAGAAATGGCTGTCAGAGTTTGAAAGCCCTCT 
ACCATGACATCGATGCCATGGAACTGTATCCCGCCCTGCTGGTGGAAAAGCCTCGTCCAGATGCT 
ATCTTTGGGGAGACCATGGTAGAACCTTGGAGCTTCATTCTTCTTTGAAAG 
CCCATCTGGTC 



TATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC 
GGCACGAGGGCGGAACCATGGCCAGCCCGCTGCGCTCCTTGATGCTACTGCTGGCCGTCCTGGCC 
GTGGCCTGGGCCGGAACCTCCAGGCCACCCCCGCGATTGTTGGGAGCTCCGCAGGAGGCAGATGC 
CAGCGAGGAGGGCGTGCAGCGAGCGTTGGACTTCGCCGTAAGCGAGTACAACAAGGGCAGCAACG 
ATGCGTACCACAGCCGCGCCATACAGGTGGTGAGAGCTCGTAAGCAGCTTGTGGCTGGAATAAAC 
TATTATTTGGATGTGGAGATGGGCCGAACTACATGTACCAAGTCCCAGACAAATTTGACTAACTG 
TCCTTTCCACGACCAGCCCCATCTGATGAGGAAGGCACTCTGCTCCTTCCAGATCTACAGCGTGC 
CCTGGAAAGGCACACACACCCTGACAAAATCCAGCTGCAAAAATGCCTAAGAGCTGAGTCTCATA 
GGACCATGCCAATGGTCCCTTACTTGTTCCCCTACCCTGTAGTGTTTTATCCCTGAAAAGGGTGC 
TCCAGCTCTGGAGGGCATCTNCGGGGTGTTCCCACCAGGAGACAGTAAAGAAACTGCTGCAGGCA 
GGTTCTGCACGTCAGAACAGCTGTCCCCTGGTTCTCTTCTCCTTGCAGTAN 
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Cytochrome c oxidase 
subunit II 
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^GGCTCGTATGTTGGGGGAATTGTGAGCGNATACCAATTTCACN^ 

3ATOACGCCAAGCTATTTAGGTGACACTATAGAATACTCAAGCTATGCATCAAGCTTGGTACCGA 
5CTCGGATCCACTAGTAACGGCCGCCAGTGTGCTGGAATTCGCCCTTCACACAAGCACAATAGAC 

3C CC ATG AAGTAGAAAC AATTTG AACAATTCTCCC AGC TGTC ATTCTTATTCTAATTGCC CTTCC 
TCCCTACGAATTCTATACATAATAGACGAGATTAATAACCCAGTTCTAACAGTAAAAACTATAG 
3ACACCAATGATACTGAAGCTATGAATATACTGACTATAAAGACCTATGTCTTTGACTCCTACATA 
\TCCC AACC AATGACCTAAAACC AGGTGAACTTC GTCT ATTAG AAGTTGAT AATCGGGTAGTC TT 
ftCCAATAGAACTTCCAATTCGTATACTAATCTCATCCGAAGACGTCCTGCACTCATGAGCCATCC 
^TTCACTAGGGTTAAAAACCGACGCAATCCCCGGCCGCCTAAACCAAGCTACAGTCACATCAAAC 

:gaccaggtctattctatggccaatgctctgaaatttgcaagggcgaattctc 
acactggcggccgctcgagcatgcatctagagggcccaattcgc 


Cytochrome c oxidase 
subunit IV 


X14209


CTNCGAAANNGNACNGNANGOTANNGAAC^CNAGCANTTNACTTC 

NGGTNT AAAC CTNANNGN^n^TNT AAAANNNANGNT AATAANT ANNN AGGNNNCTNGOT 

GNNANCNTNGAGTNANATATATTTTNGNGNANTNNANACGTNG 

TNANNTGTGAATNATCAAANGANCANCCTTCANNGANCNANANAOTGNANAT^ 

NNGNATNAAAGGTTGNGTCCNNNGCTCNGGA^ 

ATTNGCCCTTATCANGGGATCCANTANAATNCGGGTGTGCCTTAGGGCCACATGGGAGTGTTNTA 
AANAGTGAAGANTATGTCTCCNCGTNATA^ 

GCNCACGTCAANGTTNATGTCTGCCAGCCAAAAGGCCNNTAAGNAGAAGGAGAAGNCCCAGTGGA 
GCAGCCTTTCCAGGNNTNAAAAAGTCCAATTGTNCCGCATCCAGTTNANCGAGAGCTTNGCTCAG
ATNAACAAGGGCACCAATGAGTGGAAGACAtm3GTGGGCCTGGCCATGTTC 
TGCGCTTGTGCTGATCTGGGAGAAGAGCTACGTGTANGGCCCGTCCCTCATACCTTTNATCGTGN 

TTGGGACATAAGACNGCGCANGCNGTNTNAACNTTC


Cytochrome P450 14 DM 


D55681


GCGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCG 

CCCTTCAGGACATCAGGTGTGTGTTTCTCCAACTGTCAATCAAAGACTGAAAGACTCCTGGGTAG 

AACGCTTGGACTTCAATCCTGACCGCTACCTACAGGACAACCCAGCGTCGGGAGAGAAGTTTGCC 

TATGTGCCGTTTGGAGCCGGGCGCCATCGTTGTATTGGAGAAAATTTTGCCTATC 

GACAATTTGGTCCACTATGCTTCGTTTATATGAATTTGACCTCATCAATGGATATTTTCC 

TGAATTATACAACAATGATTCATACCCCAGAAAACCCAGTAATCCGTTACAAACGAAGATCAAAA 

nrnaraiiarraarA ART: AGC C AGTGTGGAGACGGGACTGC AAGCTGC AGCTTGGC AGAGAATGAA 

GCTTTGACACAGCTTTCATACTGTACTGTTTTTTAGGTGTGTGGT^ 

TAATGTTTTATTAACTCGGTGATTTTTGTCAGACCT 

GCGGNCGTACTAGTGGATCCGAGCTCGGTACCAAGCTTGATGCATAGCTTGAGTATTCTATAGTG
TCACCTAAATAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAAATGGTATCCGCTC 

CAATTN 


Cytochrome P450 1A1


X00469


TCTGGTCCTCAGCATCTCCAGGCTTAGACTGTCCTGGATGCTCACCAGAC 


Cytochrome P450 1B1


U09540


NGGTGACACTATAGAATACTCAAAGNTATC^ 

ACGGCCGCCAGTGTGCTGGAATTCGCCCTTATCGCGGGATCCAGCGGAAACCAAGTGGCCTGAAG 
GTGAGGCGGGCTTACCAATTCATGGCTCCTCACCGGCCAGCAGCTGGAGATCCTGAAGTATTTTG 
AAATTGAAGAGTAACAGGGCCCAAGGAATTTGCATACTGTTC 

TGCACACAAATCAGCATGTGTGTACAGCTATCCAACAAAATATTTCAGTAACTCTC^ 
GTCAATTTGAAAGGGAACTTCTATGTGCAGAAATTGGCCCCATAGGAAACCACAGTAAGCAGAGG
CTTAGGATATATATTTTCAAGATTCAAAGAAGTGATTTAAGTGTAAAATATAAAGAGCAGAAA 
CTACCAAGAGACAAATGAGGCCACTCCCTIX5TGGCCCTGGACGAGGTTTTCTTTCTG 
TGTCCCTCCCACTCTAGAACGGACCATAAAGCCGTTTTGCTCCCCCTCAAAAGCTTAAGGGCGAA 

TTCTGCAGATATCCATCACACTGGCGGCCGCTCGAGCATGCATCTAGAGGGCCCAA 


Cytochrome P450 2A3


M33190


ATATATATTTCAAAGGTAGAGCCAGAGAAGGGGGAAATA 
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CAACATACCAGATCTGCTTCTCAGCTCGGTGATCCGGCTGAGGCAGCCAT 



ACTCTCTAAGCTCTCATCTGTAATGTCTCTTCTGAGGGTCCTGTCTACTT 



TATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC 
GGCACGAGGGCCAAAGTTCATGAGGAACTTGACCGTGTGATTGGACGCCACCAACCCCCCAGCAT 
GAAGGACAAGATGAAGCTGCCTOATACCGATGCTGTATTGCATGAGATTCAAAGATACATCACTC 
TCCTTCCTTCCAGTCTGCCCCATGCTGTGGTCCAGGACACAAAATTCAGAGACTATGTCATCCCC 
AAGGGTACTACTGTACTCCCGATGCTGTCTTCCGTCATGCTGGATCAAAAGGAGTTTGCCAACCC 
AGAGAAGTTTGATCCAGGACACTTTCTGGATAAAAATGGCTGCTTCAAGAAGACAGACTACTTTC 
TTCCCTTCTCCCTTGGAAAGCGGGCCTGTGTTGGTGAGAGTTTGGCCCGGATGGAGCTCTTCCTG 
TTCTTCACCACCCTTCTGCAAAAGTTTTCCCTGAAGACTCTGGTGGAGCCCAAGGACC 
CAAGCCTATTACTACCGGGATTATCAATTTGCCGCCACCTTACAAGCTGTGCCTTGTTCCTAGM 
AAGGGTTTATCCTTCTAATCATTCAAATCAAAGTAATT^ 
TATGCTAAGCACTGNCTANGGCAGAAACATCATGCATG 



TCTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAAT 
TCGGCACGAGGCTCCCCGTGTTCCTTTTCTGGAACCTCAACTGGCTCCCACTGCTC 
AGTTACCGTTTGGCGATCCCTCTCCTGCTAACATGCCGTTCGTTGAGTTGGAAACAAACTTGCC^ 
GCTAGCCGCCTACCCGCAGGGCTGGAGAACCGGTTGTGTGCGGCCACAGCCACCATCCTGGACAA 
ACCCGAAGACCGCGTGAGCGTGACGATACGACCGGGCATGACCTTGTTGATGAACAAATCCACAG 
AGCCCTGCGCCCACCTCCTGATCTCTTCCATCGGTGTTGTGGGCACCGCGGAGCAGAACCGCAGC 
CACAGCTCCAGCTTCTTCAAGTTCCTCACCGAGGAGCTGTCCCTGGACCAGGACAGGATCATTAT 
CCGATTCTTCCCCTTGGAGCCCTGGCAGATCGGAAAGAAAGGAACTGTTATGACGTTTCTGTGAT 
GGAGACAAGGAACGCAGGGCGTTTGCTTGAGCCTGTCCAGAGCCCTTCCAGAGAGGCCTCCTGGC 
AGATACGATACCAGATCCCTCTTTTGCATAAGTGTCTGTGATCTCACTGACCTGGTTTCCTCTCC 
CCCAGCCTCGTGGAACGAGGAGAGCAAATTAAAGAAGAGAGCC 



ANNNCNTCTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCC 
AAGAATTCGGCACGAGGCTCGGATACATCCGCATCTCAGACACCAACATAACTGCTATTCCTCAA 
GGTCTGCCCACTTCTATCAGTGAACTGCATCTGGATGGCAACAAGATCGCCAAAGTTGATGCAGC 
CAGCCTGAAAGGAATGTCTAATTTGTCTAAGCTGCWTTTGAGCTTC 

AAAATGGCAGTCTGGCTAATGTTCCTCATCTGAGGGAGCTCCACTTGGACAACAACAAACTCCTC 
AGAGTGCCTGCTGGGCTGGCACAGCATAAATATGTCCAGGTCGTCTACCTTCATAACAACAACAT 
CTCCGAAGTTGGGCAGCATGACTTCTGCCTCCCTTCATACCAGACTAGGAAGACTTCCTACACTG 
CCGTGAGTCTTTATAGCAACCCTGTCCGGTATTGGCAAATTCACCCACACACCTTCAGATGTGTC 
TTCGGGCGCTCTACCATTCAACTTGGGAACTACAAGTAACTCCCAAACAGCCTCATTTTTATAAT 
CGGGAACAAAAAAACCAATCTGTCAATATTATGCTAAAAAGAAAAAAAATATTTTGAAAAAGAAA
GAATGCTAGATTCTGGGAAATTCAAGTATAGCGCGGATGCCTT 






AGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCGC 
CCTTCGCGGGATCCATGTCGGCGTCGGTAGTGTCCGTCATCTCCCGGTTCCTGGAGGAGTACTTG 
AGCTCCACTCCACAGCGGCTGAAGTTGCTGGATGCCTATCTCCTTTATATATTGCTGACCGGGGC 
GCTGCAGTTCGGCTACTGTCTCCTCGTGGGCACCTTCCCCTTCAACTCTTTCCTCTCTGGCTTCA 
TCTCTTGTGTGGGCAGCCTCATCTTAGCGGTTTGCTTGAGAATACAGATCAACCCCCAGAACAAG 
GCGGATTTCCAAGGCATCTCTCCTGAGCGAGCCTTTGCTGATTTTCTCTTTGCCAGCACTATCCT 
GCACCTCGTCGTCATAAGCTTGGCCAAGGGCGAATTCCAGCACACTGGCGGCCGTTACTAGTGGA 
TCCGAGCTCGGTACCAAGCTTGATGCATAGCTTGAGTATTCTATAGTGTCACCTAAATAGCTTGG 
CGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAATTCCACACAACATA 
CGAGCCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAAGTGAGCTAACTCACATTAATT 

GCGTTGCGCTCACTGCCCGCTTTCN 
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Diacylglycerol kinase 
zeta 




s a naTTrMCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCGC 1 

'otaaggggaccatggatcaacgaggga^ 

^CTC^CATTCCTGTCAGATGGCTATGGGGGGACCCTGTCACAGGGAAGGAGCCCCGTCCCACC 
CCTCAGAAGCCGTTCAGATCTAGGGCTGGACTCTAAGGAGCTGGACTCTCACCTGTCCCTGGT^ I 
rCM^GGGAACAGGAAACAAGCTGGGCTGACTGGGTCCCTCCCTTCAGGGCGGCCTCCC^ 
'ACAGCTGATGGAATGGCTGGACAGCTCAGTCAGGGAGGCCTGCTCTCAGCAGGACTTTCTAAAG 
; CACCTCATCCCTTGGGCTCTTTGGAAGGTTCTGGGTGCCTAGCCCTCCTCTCTC 

"ttcwkatcccagaaactcaagagcctgctgtatttc 

: CCTGGTGATCCTCCTCATGCACCCAGTCATTTCATTTCCGACTGTATGGCCTGGGGTGA^GGC 
SAATTCAACACACTC^GGCCGTTACTAATGGATCCGACTCGGTACCAACTTGATGC^TAAC'^ 
^A^CTATAGTGTCCCTAAAATAGCTTGGCGTAATCATGGCATAGCTGNTTNCTGTGTGAAATl 

TGTN I 


Dimethylarginine
dimethylaminohydrolase 




TTrTNAGAACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAI 
A^GGCACGAGGATGATTTATTACAATCATCTCCTCAATAGACACACTAmA^A^AA 
TOAAAAATTTCTTCCCAAACCTTTCCTGCACCTCCCTCACCCAAAACTATAGC^ 
GAATAACCCTTGAGAATCAAAATGAACGAAAATCTATTTGCCTCTTTCATTACCCCCACAAT 

AGGTCTACCAATTCTTCTGAWATTATTATC^ 

TAATCAGCAACCGACTACACTCATTTCAACACTGACTAATCAAACTTATCATCAAA 
T^TCCA^CACCAAAAGGACGAACCTGAGCCCTAATAATTGTATCCCTAATTATATTTATCGG 

CTCAACCAACCTTCTAGGGCT^ 1 
AAACGTCACCTTACCTCACTGGAGGATAATGAATCCTAGTCATTAGAGAAAATGTTTTAGCTOAT 

CTAAMTTACAATGGATreCTTTTATTATCACGTATC 

fAATOTAAATGTAGAAAAAGTGCTAGATATCAGAGATTTNCAT 1 


DNA binding 
protein inhibitor 
ID2 


D10863


GGAATTCGCCCTTCGCGGGATCCATGAAAGCCTTCAGTCCGGTGAGGTCCGTTAGGAAA^CAGCl 

CTGTCGGACCACAGCTTGGGCATCTCCCGGAGCAAAACCCCGGTGGACGACCCGATGAGTCTGCTl 

C^CAACATGAACGACTGCTACTCCAAGCTCAAGGAACTGGTGCCCAGCATCCCCCAGAAC 

AGCTG^CCAAGATGGAAATCCTGCAGCACGTCATCGATTATATCTTGGACCTC 

GACTCGCACCCCACTATCGTCAGCCTGCACCACCAGAGACCTGGACAGAACCAAACGTCCAGGAC 

G^CGCTCACCACCCTCAACACGGACATCAGCATCCTGTCCTTC 

AGCOTATGTCGAATGACAGCAAAAAGCTTGGCC 

GCGGCCGCTCGAGCATGCATCTAGAGGGCCCAATTCGCA


DNA topoisomerase X 


<N 

CO 

r- 
o 
•<* 

a 


AATTGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCGCC^ 1 
^TCGCGGGATCCTGCAGCAGCAGTTTAAAGAGCTCACAGCCCCTGATGAGAATGTACCAGCAAA 

caaagacctttgagaagtcaatgatgaacttgcagtctaagattgatgccaagaaagatcagctaI 

^ATCCTCGAAAGGACC^ 

SgaagSag^aItcaaaaaagaaggctgt^^^ 

AGGTTCAAGCCACAGACCGAGAGGAGAACAAACAAATTGCCTTGGGGACCTCCAAACT 
C^AC^AGGATCACAGTGGCTTGGTGCAAAAAATGGGGGGTCCCAATCGAGAAGATTTACAA 

CAAA^CCCAGAGAGAGAAGT^TCCTTGGGCCATTGATAAGCTTGGCC 

ACAC^GGCGGCCGNTC^GTGGATCCGAACTCGGTC 


Dynein light chain 1


U66461


" AGNGAGGCTTGATCAGCGAGCTTCTAGCATTTAGGTGACACTTATAGAATAGGG^^AG^ 
CATGCTCGAGCGGCCGCGATATCGAATTCGCCCTTCGCGGGATCCCTGTCGCCTCTGCTC 

gcggcgcSgcacctoccctaggagc^ 
^gaccSaaggcggtcatcaaaaatgcagacatctcggaagagatc 

GTCCGCTACTCAGGCGTTGGAGAAGTACAACATAGAGAAGGATATCGCGGC^^ 

ag^gacaagaagtacaaccccacctggcactgcatcgtgggccggaacttcggto 
acacacga^caaacacttcatctacttctacctcggtcaggtgg^ 

TGGOTAA^GCATGGACTCTGCCAAACACCC^ 

A^A^TAeCAGAGAC^^ 

TTTTCTA^AGGGCACA^GCTTGGCCAAGGGCGA^ 

TCCGAGCTCGGTACCAAGCTTGGGTCTCAA
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TGTGAATTCGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATT 
CGCCCTTACATCAGCTTCTGAGCCGTGGTTACCACTTCGATGAGCGCTCCTTCAGAGAAGTGGTC 
TTCCAAAAGAAGGCTGCAGACACGGCTGTCGGCTGGGCGCTGGGCTACATGCTGAATTTGACTAA 
CCTGATTCCTGCCGACCTCCCCGGACTACGCAAGGGCACCCACTTCAGCTCCTGGGTCCGCTCTC 
CTGCTGCTCTTCACAGTCCTGATCTTGGCGGCGCTGGTCCTGCTCCTGCGCCAGGATGTCAGGTC 
TCAGCCTGTGACTCAGGGTGAGGTCCATTCGGAGTGGGACTTTTGTTC 

CACACTGGCGGCCGTTACTAGTGGATCCGAGCTCGGTACCAAGCTTGATGCATAGCTTGAGTATT 
CTATAGTGTCACCTAAATAGCTTGGCCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTT 
ATCCCGCTCACAATTCCACACAACATACGAGCCCGGAAGCATAAAAGTGTAAAAGCCTGGGGTGC 
CTAATGAAGTGAGCTAACTCACATTAATTGCGTTGCGCTCACTGGCCGCTTTTCAGTCGGGAAAC 
CTGTCGTGCCAGCTTGCATTAATGAATTCGGNCAACNCCCCGGGGAGAAGGCGGGTTTGCGTATO 

TGGGCCGCT 



NNNNNNNNNNTTCTACATGATTACGANTTTAATACGAC 
GCCAAGAATTCGGCACGAGGGCCGCAATGTTTTTTTTTTTTT^ 

GGTCAAACACAATGATTTATTAAAAATAAAACGTAAAAATGATTTTTGTACATATGCTTCCAAAT
TTCAGGCATGGGATCCAAGTAGGTTTCATAGAAAACGCTGTAGCCAGGTATCAAGTCCTTACAAC 
AAAGTAAACTACCCTCCCACCCAACCCCCACCCCGGTTTTGCTACAGAATCAGCAAGTTCAGCCC 
CCCGCCCGCCNCCCCCCCAAAAAACACAAATTAAAACGACACATCTTGNTAGTNTAAAAACNACC 
NAGGTCCAAGTAATNATAAAAAAATAGAGTCCNTCAATGACTGTAACACNAAAATGTGTGTGTGG 
GGCCGAGTCCACCTTCCGGGGGGACCGGGACGGGCAAGCAANCGGGGGGTCCCCCCCCCCGGGGT 
GAGCGGCTCTNCCGGGGGCACCTGGGGTGGGCNCCGAAGGCCAAGGAAGCCCCCCCTNCCGCCCC 
CGCCGNTTGGCATTCGGNAACCCGGCTTTTNAAAC^AACCTCC 
CCGCCGNAACCGCCNCITIGTACNGGN^ 
NTGANGGTNNATTTTTNCCN 



"5 
c a 

o i-t 

•H to 

tn 

c 
o 



CACTATAGAATACTCAAGCTATGCATCAAGCTTGGNCCGAGCTCGGATCCACTAGTACCGGCCGC 
CAGTGTGTTGGAATTCGCCCTTTGAAGCTTT^ 

CAACGTAAAGAACGTGTCTGTCAAAGACGTTAGACGTGGCAATGTTGCTGGGGACAGCAAAAATG
ACCCACCAATGGAAGCAGCTGGCTTCACTGCTCAGGTGATTATCCTGAACCATCCAGGCCAGATC 
AGTGCTGGCTATGCCCCTGTTCTGGACTGCCACACGGCCCACATAGCATGCAAGTTTGCCGAGCT 
TAAAGAGAAGATCGATCGTCGTTCTGGTAAGAAGCTGGAAGATGGCCCCAAATTCTTGAAGTCTG 
GTGATGCTGCCATTGTTGACATGGTCCCTGGCAAGCCCATGTGTGTTGAAAGCTTCTCTGACTAC 
CCTCCACTTGGTCGTTTTGCTGTTCGTGACATGAGGCAGACAGTTGCTGTGGGTC 
CGTGGACAAGAAGGCTGCAGGAGCTGGCAAAGTCACCAAGTCTGCCCAGAAAGCTCAGAAGGCTA 
AATGAATATTATCCCTAACACCTGCCACCCCAGTCTTAATCAAAGGGCGAATTCTGCAGATATCC
ATCACACTGGCGGCCGCTCGAGCATGCATCTAGAGGGCCCAATTCGC 



TATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC 
GGCACGAGGGCTTCCTCTGCTAGGCTCCGAGCTGTCTGTTTCTTGGGCCCTGTGTATAGGTGCCG 
GCCGCCCCTGGCGTTC^^CACCGCTTCCAGGTGCCGGCCGTGTCTGCCATGGACGACTATGCGGT 
TTTGTCCGATACTGAGCTGGCCGCAGTGCTACGCCAGTACAACATCCCGCATGGGCCTATTGTGG 
GCTCCACTCGCAAGCTCTACGAAAAGAAAATCTTCGAGTACGAGACCCAGAGAAGGAGGCTTTCG 
CCCCCCAGCTCGTCATCGTCTTCATTCTCCTATCGGTTCTCAGACTTGGATTCAGCCTCCGTGGA 
CTCAGACATGTATGATCTGCCCAAAAAGGAGGACGCCTTACTTTACCAGAGCAAGGACTATAATG
ATGACTACTATGAGGAGAGCTATTTGACTACCAGGACATACGGGGAGCCCGAGTCTGTGGGCATG 
TCCAAGAGCTTCCGCCGGCCAGGGACCTCACTTGTAGATGCTGATGATACCTTCCATCACCAGGT 
GCGTGATGACATTTTCTCTTCTTCAGAAGAAGAANGCAAGGATAGGGAACGCCCCATCTATGGNC 
GANACAGTGCCTTACAGANCATCGCACACTACCGCCCATT 






TTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATT 
CGGCACGAGGCACTGACAGAAGCACATGGGTGGAGGAAAGACCTCTGCGACTGGCTGATTGATCC 
CTGCTGAAAGCCGAGGACCTTGTCCACAGACAGGAACAGTTCTCTTCATGAATGAAGGTCAGAGA 
CAAGTGGGTGCTGCCATGGTGGACAACACAATGTCATCTAGAATGGCTGAACCTCTACCCCCCCA 
GCACATCAGCGCCACAGATGCCTTTGCCATCTCTTGGATGCCCTGATAAAGCCAACAACTGTGAG 
TACTATTCATTGCCCAGGAAAACAGGAGGGAAGAGATTCAGTGACATGGGGCAGTGACAAAACAA 
ATAAAGTGGCTCGGGAAATGGCTATACAGGAGCCTATCCTGGTTACAGGCCTGCAAGAGACAGCC 
ACTGAGAACTAGGATTAAACTAAGGGGATGGCCTCACTTAGAAAAGGCCCAAGTTGTTTTAAAAG 
ATAAAAAGACNATGACACACTTGAGGGGAAGGCTATACTCCCCAGAAAACAAAGAAAAAGACTCA 
CTTTGCCAAATACAGAAATGGACTCATTTAGGAGATAAAAANTTTGTCCCAAGTAGTTAAGGTA^ 
ATGTAATANACTTTAAAAATTTTANCCCAAGAGAGACAGTAAAAA 
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CP u 
o 

«— I 4J 

id u 






ITTATGACATGATTACG* 
CGGCACGAGGGGCAGCACTGTGTCCC 



<l> *H 0) u 

*J X! *J i 



TGGTTCATCACCTTCATGGCCGCi 
IcTTCGGGCCC 



wTAGCAATCAGCACCCTCCAAAGGTATCCAGGCCTAGAGG 
TGCGCCATGAGGACGGACACCCGCTGAAGCTTTGCTGGTC 



! s § ^ I 

u 



jCGGCTCCAGAC^CTGCT^TC^^^ 



IraArCTCTGAGGATGGAC 



8 



Igcctctgaagtattcttgaacatt 
|attcttagttcagtgcaaaaaga< 



!tcgctgaagatagtgcaggccctggggagtttctcctt I 



^AGIGGGTAATGTTTTGTTTTTTTTTTCTTGAAACAGGGCCCAGAA I 
TCACAACCTGGTAAATCCTGGGCCTACAGGTGTGCGCCGACATGTC 1 
^A^GCACAGTGGGGAGATGATTGTAACATTAAGTCTGTC 



' 0> 
T 1 tfl 



itgaggagcagcctctacaa' 
Igtcctataaatgcttgggggg 



TGGCCTTGAACTCTGATCC 



TGCTTGTCAAAGAGATATTCTGCCATGCCA< 
GTGGTCACCCAGTTCTTTAATGGATTTCACC 1 



|ATAAGTAGGGATCATTCTTGTCAGTA< 



^^JTGCTCATTCAGGTAATGCGTCTCAATGAAGTCAC 
.GCCAGTTTGTGAAGTTCCAGTAGTGATTGATTCACACTC 



icggtcaaaataacaagacatggacagatal 
Igttgatggcaggatcccgcgat; 



irtunvjn. j. ^GACGTAGGAGGC ATAC AACTC CAGGTTGATCTGGCG 
^AAGGGCGAATTCCAGCACACTGGCGGCCGTTACTAGTGGATCC 






3 



TCTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAAT 
TCGGCACGAGGGTACCCAGTCTCAGGACAGCTGTTCACTCCAGGCCTCCGACTCTGAGCCCGTTG 
GTCTTTGCCAAGGTTCACTGATTAAAAGTCCCGGGGTCCCTCCTCAACGCTTTAAAAAGACTGTC 
ACTGTGTCGTGCGAGTTTTTCGAATCTCAGGACCAGGTCCCTGGAGGTGAGAACCCTGCTGATAC 
CCAAGATGCTAAGAAACTCCCTCAGAAAAACACAGCCCCTACCAGCTCACCCTCCATAACTGCAC 
CAAGAGGATCTATCCAACACCTCCCTGAGCAGGAGGAGCCTGAAGACTCCAAGGGAAAGAGTCCT 
GAGGAACCCTTTCCTGTGCAGCTGGATCTAACCACAAACCCACAGGGTGACACACTGGATGTCTC 
CTTCCTCTACCTGGAGCCTGAGGAAAAGAAACTGGTGGTCCTGCCTTTCCCTGGGAAGGAACAGC 
GCTCCCCTGAGTGCCCGGGGCCCGAAAAGCAAAGAACCCCCTGATGCTCCCCGCTGAGACTCACT 
AGCAGGGTTCCACGGGGTACGGTCCCCTGCAGTAGATGGGAGGTGGTGGGCATTGGGAANGCACA 
GACAATCAAATGTAGACCGGCTAATAAAGTGTGT 






GTATGCATCAAGCTNGGTACCGAGCTCGGATCCACTAGAACGGCCNGCCAGTGTGCTGGAATTCG 
CCCTTCGCGGGATCCTTGGGCTGGGCAATGAGAAGATTCATCTGATAAGCATGCAGTCCACCATC 
CCATACGCACTGAGAATACAGCTCAAAGACTGGAGTGGCAGGACCAGCACCGCGGACTATGCCAT 
GTTCAGGGTGGGTCCTGAATCCGACAAATACCGCCTGACCTATGCCTACTTCATTGGCGGAGATG 
CCGGGGATGCCTTCGACGGCTACGATTTTGGTGATGATCCCAGTGAC 

AACGGCATGCACTTCAGTACCTGGGACAATGACAACGACAAGTTCGAAGGCAACTGTGCTGAGCA 
GGATGGATCTGGCTGGTGGATGAACAAGTGTCACGCTGGCCACCTCAATGGAGTTTATTACCAAG 
GTGGCACTTACTCCAAGTCATCTACTCCTAACGGTTATGACAATGGCATTATTTGGGCCACCTGG 
AAAACCCGCTGGTATTCCATGAAGGAAACCACCATGAAGATAATTCCCTTCAACAGACTCTCCAT 
TGGAAAGCTTGGCCAAGGGCGAATTCTGCAGATATCCATCACACTGGCGGCCGCTCGAGCATGCA 
TCTAGAGGGCCCAATTCGCA 






TTGCGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATT 

CGCCCTTATCGCGTCTAGATGCCCTTTGGGAGATCTTCTCTAGCATTCCACCAGCAGCGAGGAAG 

TAACCTTGTCCTCGGTCGCCCGCACTCACAGCTCCAACTGATGACAGCTAGCTGAAAGTCTCTCT 

TGTATAAGTTTAACTACTTATACATGGTTTTGATTCTGTTATTTTTCTAAA 

CTCTGGATCCAAAATGTGGCATTTTTCTGAGAATGAAGATTC 

TACTTGGTGTTCAAATTGGGATGTATGTTCCTAGAAG 

ATTGGAAGAGAAGTGTCTGAGGAGGGAGGGCTCCCAAGACACTGAGACTGGCTATCCTTCCTGCC 
AGAATTCCTGTCCAGACTGAATTGCAATATGCTAATCTCATTTATAGAGAAAGTGCATAAAAGCT 
ATATTTTGAAGAATGAGTGGTTTCAAAAGAAAACTTCTGCCCTCCCTGTC 

CACACTGGCGGCCGTTACTAGTGGATCCGAGCTCGGGACCAACTTGATGCATAACTTGGAGTATT 
CTATAAGTGNCCCTAAAATAGCTTGGCGTAATCN 






GGGNTCCACTAGTANCGGCCNGCCAGTGTGCTGX3AATTCGCCC 

TGCCTTTCGCCTTTGAGACAGTGTCCAGCTGGNAGCTGGAAGCNTGGTATGAGGATCTGCAGGAG 
GTCCTGTCCTCAGATGAAATTGGGGGCACCTATATCTCATCCCCAGGAAACGAAGAGGAAGAATC 
AAAAACCTTCACTACTCTTGACCCTGCATCCCTAGCTTGGCTGACTGAGGAGCCAGGGCCAGCAG 
AGGTCACAAGCACCTCCCAAAGCCCTCGCTCTCCAGATTCCAGTCAGAGTTCTATGGCTCAGGAG 
GAAGAAGAGGAAGATCAAGGAAGAACTAGGAAACGGAAACAGAGTGGTCAGTGCGCAGCCCGGGC 
TGGGAAACAGCGCATGAAGGAGAAGGAGCAGGAGAATGAGAGGAAAGTGGCACAGCTTGCTGAAG 
AGAACGAGCGGCTCAAGCAGGAAATCGAGCGCCTGACCAGGGAGGTAGAGACCACACGGCGGGCT 
CTGATCGACCGCATGGTCAGTCTGCACCAAGCATGAACTGTTGGCATCACCTCCTGTCTGTCTCT 
CCCGGAGTGTACCCAGCACCATCACGCCAGTGCCAAGCATGTAATCTCCAGTGCACATGCTGAGG 
AGGAAGCTTGGCCAAAAGGGCGAATTCTGCAGATATCCATCACACTGGCGGCCGCTCGAGCATGC 
TTTAGGGGNCCAATC



TTNGGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATT 
CGCCCTTCGCGTCTAGAGACTTTGGAGGAATTCTCGGCCGCAGAGCAGAAGATCGAAAGGATGGA
CACGGTGGGCGATGCCCTGGAGGAAGTGCTCAGCAAGGCTCGGAGTCAGCGCACCATAACTGTCG 
GCGTGTACGAGGCAGCCAAGCTGCTCAACGTAGACCCGGACAACGTGGTCCTGTGCCTGCTGGCT 
GCGGATGAAGATGACGACCGGGACGTGGCTCTGCAGATCCATTTCACCCTCATTCGTGCTTTCTG 
TTGCGAGAACGACATCAACATCCTGCGGGTCAGCAACCCGGGTCGGCTGGCAGAGCTGTTGCTAC 
TGGAGAACGACAAGAGCCCCGCTGAGAGCGGGGGCGCTGCGCAGACCCCGGACTTACACTGTGTG 
CTGGTGACGAACCCACATTCATCACAATGGAAGGATCCTGCCTTAAGTC 

CCGGGAAAGTCGCTACATGGATCAGTGGGTGCCAGTGATTAATCTCCCCGAACGGTGATTCAAGC 
TTGGCCAAGGCGAATTNCAACACACTGCGGGCGGTACTATGGATCCAACTCGGACCAACTTGATG 
CATACTT 
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CTTTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAA 
TTCGGCACGAGGGATTTGCCCTGGCAAATGTACACACCTCATGCTAGCCTCATGAAACTGGAATA 
AGCCTTTGAAAAGAAATTTGTCCTTGAAGCTTGTATCTGATATCAGCACTGGATCGTAGAACTTG 
TTGCTGATTTTTGACCTTGTATTCAAGTTAACTG 

AGGATTTCCCGAGGCTGGCAAGGGTTCCTGAACTAGTTACCACTTCTTTTCTTGCCAGTCTAACA 
GGGTGGGAAAGTCCGAGCCTTAGGACCCAGTTTCTGTTCTGGTTTTTTCCCTCCTGACCTCCATG 
GGTTGTTACTTGCCTTGAGTTGGGAACGTTTGCATCGACACCTGTAAATGTATTCATCC 
TTTATGTAAGGTTTTTGTACTCAATTCTTTAAGAAATGACAAATTTTG 

GAGAACATTAGGCCCCAGCAACACGTCATTGTGTAAAGAGAAATAAAAGTGCTGCAGTAACNNCN 
TAAAAANNCCANCNNNAACNNANAAACCNATTC 

TAATTTTAGCTTTGGCNCTGGCCGCCGTTTTTACAACGTCNGNGACTGGNNAAAN 






GATNANTTCAAGCTCGNCCAANTTCACCAACCAGTTTGGGGTAGCGCCCTCACCANCCAACTTCA 

TCAAGNCAGGTAAGCANCCGCTTTCATCCATGTGCCCCTCAATCATCGTGGATAAGAACGGCAAG 

GTTCGGATGGTGGTTGGAGCCTCGGAAGGTACCCAGATCACCACGTCTGTC 

CAACAGCCTGTGGTTCGGGTATGATGTGAAGAGAGCTGTGGAGGAGCCCCGTCTTCACAACCAGC 

TTTTGCCCAATACCACAACAGTAGAGAAAAATATTGATCAGGTGGTGACTGCAGGTCTGAAGACT 

CGGCACCACCATACAGAGGTCACACCCGACTTCATCGCTGTGGTTCAGGCCGTCAAGCTTGGCCA 

AGGGCGAATTCTGCAGATATCCATCACACTGGCGTGCCGCTCGAGCATGCATCTAGAGGGCCCGT 

TCANCAGGTAACAAA 



TGCGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTC 
GCCCTTATCGCGGGATCCTGGGCACCTTTCTTCAGTGAGCAA 
ACTTTGATGGCTCCAGAATTTTTAATGAAAGCAAGACTGTTGCTC 
CAGATTTTATAATTTTTTTATTACTC^ 

ATCTCCACACTGTAGTCTTCACCTTGATTGGCCTAGTGCCTGAGGGTGGAGACCACGCCCTGTCC 
AGACACATGCCTTCTTTGCCAAGCTAATCTGTAGGGCTGGACCTTTGGCCAAGGACACACTAATA 
CTGAACAATGAGCTAGGAGGCTTTACCGCAGGAGGCGGTAGCTGCCACCCACTTCTGCAGGCCTG 
GATCTCGACACCATAGGGGTCCAGGCTCCATTTAGGATTCGCCCATTCCTGTCTCTTCCAACTCA 
ACCAACCACTCGATTAATCTTTCCTTGCCTGAGACCAGTTGAAAGCACTGGAGTGCAGGGAGGAG 
AGAAGCTTGGCCAAAGGGCGAATTCCAGCACACTGGCGGGCGGTACTAGTGGATCCGAGCTCGGA 
CCAAACTTTGAGCATAACTTGAGTATTCTN 
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i w 

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GCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCGCC CTTGCGC 
GGATCCTGGCATGTTCTTCAACCCTGAGGAGTCTGAGCTGGACCTAACCTATGGCAACAGATACA 
AGAATGTGAAGCTCCCTGATGCCTATGAACGCCTCATCCTGGATGTCTTCTGTGGGAGCCAAATG 
CACTTTGTCCGTAGTGATGAACTCAGGGAAGCCTGGCGTATCTTCACACCATTGCTGCACAAGAT 
TGATCGAGAGAAGCCCCAGCCCATCCCGTATGTCTATGGCAGCCGAGGTCCCACAGAGGCAGATG 
AGCTGATGAAGAGAGTGGGCTTCCAGTATGAGGGAAGCTTGGCCAAGGGCGAATTCCAGCACACT 
GGCGGCCGTTACTAGTGGATCCGAGCTCGGTACCAAGCTTGATGCATAGCTTGAGTATTCTATAG 
TGTCACCTAAATAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCT 
CACAATTCCACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGA 
GCTAACTCACATTAATTGCGTTGCGCTCACTGGCCGCTTTNCAGTCGGGAAACCTGCGTGCCN 






GAATNGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCGCCC 
TTNGCGGGANCCCTGACTGGAATCCCTCCTGCTNCCCGTGGGGTACCCCANATTGNANNCACC^ 
NGANATACATGTGAATGGNATTCOTNGAGTGGCANNTGAAGACANAGGGACAGGNAACW 
NANTCACCATCACCAATGACCAAAACCGCCTGACCCCTGAANAAATTGAA^ 
GCCGANAANTNNGCTGAGGANGACNAAANGCTCANAGA 

AAGCTATGCTTACTCTCTTAAGAACCANATCCGGANATNAAGAGANGCTGGGAGGTAAGCTGNCT 

NCTGAANATAANNAGACCATGGAGAAAGCTGTAGAGGAAAAGATCGAATGGCTGGAAAGCCACCA 

GGATGCAGACATTGAACACTNTAAAGCTNNNAAGATNGAACTAGACNANATTGTTCANCCA^ 

TGAGCAAACTCTATNGAAGTGGACGCCCTCCCCCAACTGGGGANGAAGAAGCTTGGCCCAGGGCG 

AATTCCATCACACTGGCNGGCGK5TACTATTGGATCGATCTCGNTCCAACCTGANGCATAGCTTGA 

GTNTTCTATNNTG 
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Glutathione S- 
transferase Yb2 
subunit 


M13590 


C^TNCGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAA^ 
"GGCACGAGGCCAGCACGATGCCTATGACACTGGGTTACTGGGACATCCGTGGGCTGGCTCACGC 
^ATTCGCCTGTTCCTGGAGTATACAGACACAAGCTATGAGGACAAGAAGTTCAAACTGGGCCTGG 
icTTCCCCAATCTGCCCTACTTAATTGATGGGTCACACAAGATCACCCAGAGCAATGCCATCCTG 
^GCTACCTTGGCCGGAAGCACAACCTTTGTGGGGAGACAGAGGAGGAGAGGATTCGTGTGGACGT 
rTTGGAGAACCAGGCTATGGACACCCGCCTACAGTTGGCCATGGTCTGCTACAGCCCTGACTT^ 
\GAGAAAGAAGCCAGAGTACTTAGAGGGTCTCCCTGAGAAGATGAAGCTTTACTCCGAATTCCTG 
3GCAAGCAGCCATGGTTTGCAGGGAACAAGATTACGTATGTGGATTTTCTTGGTTACGATGTCCT 
rGATCAACACCGTATATTTGAACCCAAGTGCCTGGACGCCTTCCCAAACCTGAAGGACTTCGTGG 
CTCGGTTTGAGGGCCTGAAAAAGATATCTGACTACATGGAAGACGGGCCGCAAGCTTTATTTCCT 
I^AGTGAGGGGTTAATTTTAGCTTNGCACTGGGCCGCCGTTTTACAACGTCGTG 


Glyceraldehyde 3-phosphate
dehydrogenase


M17701


AATTGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCGCCCT 
TCGCGGGATCCATCCTGCACCACCAACTGCTTAGCCCCCCTGGCGF^GGTCATCCATGACAACTT 
TGGCATCGTGGAAGGGCTCATGACCACAGTCCATGCCATCACTGCCACTCAGAAGACTGTGGATG 
GCCCCTCTGGAAAGCTGTGGCGTGATGGCCGTGGGGCAGCCCAGAACATCATCCCTGCATCCACT 
GGTGCTGCCAAGGCTGTGGGCAAGGTCATCCCAGAGCTGAACGGGAAGCTCACTGGCATGGCCTT 
CCGTGTTCCTACCCCCAATGTATCCGTTGTGGATCTGACATGCCGCCTGGAGAAACCTGCCAAGT 
ATGATGACATCAAGAAGGTGGTGAAGCAGGCGGCCGAGGGCCCACTAAAGGGCATCCTGGGCTAC 
ACTGAGGACCAGGTTGTCTCCTGTGACTTCAACAGCAACTCCCATTCTTCCACCTTTGATGCTGG 
GGCTGGCATTGCTCTCAATGACAACTTTGTGAAGCTCATTTCCTGGT 

GGCGAATTCCAGCACACTGGCGGGCGGTACTAGTGGATCCGAACTCGGTCCAAACTTGATGCATA 
GCTTGAGT 


Glycine
methyltransferase


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Stgacatgattacgaatttaatacgactcactatagggaatttggccctcgaggccaagaattc 

GGCACGAGGCAGCCTTTGACAAGTGGGTCATTGAAGAAGCCAACTGGTra 
GTGCCAGCAGGAGATGGCTTTGACGCTGTCATCT^ 

CAGCAAAGGTGACCAGAGTGAGCACCGGCTGGCGCTAAAGAACATCGCAAGCATGGTGCGGCCCG

GGGGCCTGCTGGTCATCGACCACCGCAACTACGACTACATCCTCAGCACGGGCTGTGCACCCCCA 

GGGAAGAACATCTACTATAAGAGTGACCTGACCAAGGACATTACGACGTCAGTGCTGACAGTAAA 

CAACAAAGCTCACATGGTAACCCTGGACTACACAGTGCAGGTGCCAGGTGCTGGCAGAGATGGCG 

CTCCTGGCTTCAGTAAGTTTCGGCTCTCTTACTACCCACACTGTTTGGCGTCTTTC 

GTCCAAGAAGCCTTTGGGGGCAGGTGCCAGCACAGCGTCCTOTGTGACTTCAAGC 

CGGCCAGGCCTACGTTCCCTGCTACTTCATCCACGTGCTCAAGAANACAGGCTGANCCTGGCTNC 

NGCTTCCACCCTAANAACATCCCTACCACAGATATTGCAGANAT 


Heme binding 
protein 23 


D30035


CATTGAAACGGNNNCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATT
CGCCCTTCGCGGGATCCCAAAGCCACGGCTGTTATGCCCGATGGACAATNCAAAGATATCAGCCT 

AAGTGATTACAAAGGAAAATATGTTGTATTCTTTTTTTACCCTCTTC 

CCACGGAGATGATTGCT1TCAGTGATAGAGCNGAANAATTTAAGAAACTCAACTGCCAAGTGATT
GGAGCTTCTGTGGATl^CACTTCTGTCATCTGGCATGGATTAACACACCCAANAAACAAGGAGG 
ATTGGNACCCATGAACATTCCCTTGGTATCAGATCCCAAGCGCACCATTGCTCANGATTATGGAG 
TCTTAAAAGCTGATGAAGGTATCNTCTTTCANGGGCCTNT^ 

TCNCAKATTAACCGATAAATGATCTTTCTGTTGGG






TTNTGACATGATTCGAATNNAANACCGACTCACTATAGGGAATTTGGCCCTC 

CGGCACGAGGACCCGCTACCTGGGTGACCTCTCAGGGGGTCAGGTCCTGAAGAAGATTGCGCAGA 

AGGCCATGGCCTTGCCAAGCTCTGGGGAAGGCCTGGCTTTTTTCACCTTC 

CCCACCAAGTTCAAACAGCTCTATCGTGCTCGCATGAACACTCTGGAGATGACCCCCGAGGTCAA 
GCACAGGGTGACAGAAGAGGCTAAGACCGCCTTCCTGCTCAACATTGAGCTGTTTGA 
AGGCACTGCTGACAGAGGAACACAAAGACCAGAGTCCCTCTGCAGAGACGCCCCGAGGAAAATCC 
CAGATCAGCACTAGTTCATCCCAGACACCGCTCCTGCGATGGGTCCTCACACTCAGTTTCCTGTT 
GGCGACCGTGGCAGTGGGAATTTATGCCATGTAAATGCAGTGTTGGCCCCCAGAGGCTGTGAACT 
CTGTCTCATGTAGCCTTCTCTCTGCAGGGGAGAATCTTGCCTGGCTCTCTTTTCTTGGGCCTCTA 
AGAAAGCTTTTGGGGTTCCTCGCC C CC I rCCTGTv* l L.W I 1 1 x \~ aa- j. a x vjwnnw« 
AGATGCCTGGCACATTTCT


Heme oxygenase 




Hemoglobin alpha 1 
chain 


NM_013096


GI^CTATOTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATT^ 

CAAGAATTCGGCACGAGGCAGGAAGCAATCATGGTGCTCTCTGCAGCTGACAAAACCAACATCAA 

GAACTGCTGGGGGAAGATTGGTGGCCATGGTGGTGAATATGGCGAGGAGGCCCTACAGAGGATGT 

TCGCTGCCTTCCCCACCACCAAGACCTACTTCTCTCACATTGATGTAAGCCCCGGCTCTGCCCAG 

GTCAAGGCTCACGGCAAGAAGGTTGCTGATGCCCTGGCCAAAGCTGCAGACCACGTCGAAGACCT 

GCCTGGTGCCCTGTCCACTCTGAGCGACCTGCATGCCCACAAACTGCGTGTGGATCCTGTCAACT 

TCAAGTTCCTGAGCCACTGCCTGCTGGTGACCTTGGCTTGCCACCACCCTGGGGATTTCACACCT 

GCCATGCACGCCTCTCTGGACAAATTCCTTGCCTCTGTGAGCACCGTGCTGACCTCCAAGTACCG 

TTAAGCCACCTCCTGTCGGGCTTGCCTTCTGACCAGGCCCTTCTTCCGTCCCCTGAACCAGTCTT 

TGAATAAAGCAGAAGTAGGAAAGAAAAAAAAAAAAAAAAAAAAAAAAATCTCGCG 
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CTATG AC ATGATTACGAATTTAAT AC GACTC ACT ATAGGGAATTTGGC CCTCG AGGCC AAG AATT 
CGGCACGAGGGACTCAGGAAGCAATCATGGTGCTCTCTGCAGATGACAAAACCAACATCAAGAAC 
TGCTGGGGGAAGATTGGTGGCCATGGTGGTGAATATGGCGAGGAGGCCCTACAGAGGATGTTCGC 
TGCCTTCCCCACCACCAAGACCTACTTCTCTCACATTGATGTAAGCCCCGGCTCTGCCCAGGTCA 
AGGCTCACGGCAAGAAGGTTGCTGATGCCTTGGCCAAAGCTGCAGACCACGTCGAAGACCTGCCT 
GGTGCCCTGTCCACTCTGAGCGACCTGCATGCCCACAAACTGCGTGTGGATCCTGTCAACTTCAA
GTTCCTGAGCCACTGCCTGCTGGTGACCTTGGCTTGCCACCACCCTGGAGATTTCACACCCGCCA 
TGCACGCCTCTCTGGACAAATTCCTTGCCTCTGTGAGCACTGTGCTGACCTCCAAGTACCGTTAA 
GCCGCCTCCTGCCGGGCTTGCCTTCTGACCAGGCCCTTCTTCCCTCCCTTGCACCTATACCTCTT 
GGTCTTTGAATAAAGCCTGAGTAGGAAGCAAAAAAAAAAAAAAAAAAAAA 
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GCGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCG 
CCCTTATCGCGGGATCCCTTCAAGTACTAAGGGCATGAAATATCTTGCC^ 
ACAGAGACTTAGCTGCAAGAAACTGCATGTTGGATGAAAAATTCACTGNCAAGGGTGCTGAT^ 
GGTCTTGCCAGAGACATGTACGACAAAGAGTATTATAGCGTCCACAACAAAACGGGTGCGAAACT 
ACCGGTGAAGTGGATGGCTTTGGAGAGTCTGCAGACGCAAAAGTTCACCACCAAGTCAGACGTGT 
GGTCCTTCGGTGTGCTTCTCTGGGAGCTCATGACGAGAGGAGCCCCTCCTTATCCTGACGTGAAC 
ACATTTGATATCACTATATACCTGTTGCAAGGCAGAAGACTCTTGCAACCAGAGTACTGTCCAGA 
CGCCTTGTATGAAGTGATGCTAAAATGCTGGCACCCCAAAGCAGAAATGCGCCCATCGTTTTCTG 
AACTGGTCTCCAGGATATCCTCAATCTTCTCCACTTTCATTGGCGAGCACTATGTCAAG 
CAAAAAGGCGAATTCCAGCACACTGGCGGGCGGTACTAGTGGATCCGAGCTCCGG 






GCGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCG 
CCCTTATCGCGGGATCCCTGCGGTCACAGGTGCAGGTGAGCCTGGAGGATTACATCAACGACCGG 
CAGTATGACTCTCGGGGTCGTTTTGGAGAGCTGCTGCTGCTCCTGCCCACTCTGCAGAGCATTAC 
CTGGCAGATGATCGAGCAGATCCAGTTCATCAAGCTCTTTGGCATGGCCAAGATTGACAACCTGC 
TGCAGGAGATGCTGCTTGGAGGGTCTGCCAGTGACGCGCCCCACGCCCACCACCCCCTGCACCCT 
CACCTGATGCAAGAACACATGGGCACCAATGTCATAGTTGCCAACACGATGCCCTCTCACCTCAG 
CAATGGACAGATGTCCACCCCTGAGACTCCACAGCCATCACCACCAAGTGGCTCTGGATCTGAAT 
CCTACAAGCTCCTGCCAGGAGCCATCACCACCATCGTCAAGCCTCCCTCTGCCATCCCCCAGCCA 
ACGATCAAGCTTGGCCAAAGGGCGAATTCCAGCACACTGGCGGCCGTTACTAGTGGATCCGAGCT 
CGGTACCAAGCTTGATGCATAGCTTGAGTATTCTATAGTGTGACCTAAATAGCTTGGCGT 
TGGCATAGCTGGTTCCTGTGTGAAATTGGTATCCGCTCACA 



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CACCCTCATGGNCAGCATCCCCATGGACACCACCCCCATGGTCACCATCCTCATGGTGACCATCC 

CCATGGACACCACCCCCATGGACATGATTTCCTTGACTATGGACCTTGTGACCCACCCTCCAATA 

GCCAAGAACTCAAGGGTCAAGTATCATCGGGGACATGGTCCACCACACGGACACTCAAGGAAAAG 

AGGGCCAGGTAAAGGACTCTTTCCTTTNCACCAACGACAAATCGGATATGTCTACCGACTCCCTC 

CACTGAATGTAGGTGAAGTTCTCACTCCTCCTGAAGCCAATTTGCCCATCTTCTC 

GCAACAGACCCCCACAACCAGAGATTCNGCCCTTCCCTCANACAGCCTNAAAGTCCTGTNCAGGG 

AAATTTGAGGGTAAGTTTCCACAAGTTCCAACTTTTTTTGAACATACG 

TGATTTCTTGGTAGGGGAAAGAGTCAATATTCTGAATAAATAAAATATGACCAGTTAGAAATG^ 
AAANGGGGGGGNGANANiOTAGGNGGGGNGGGGGGGGGGGGGNTTTT^ 






GGGGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTC 
GCCCTTCGCGGGATCCTGGTTTGTGAAGCTGTCATTCCAGCCAAGGTGGTGAGAGAAGTATTAAA 
GACGACTACGGAAGCTATGGTTGACGTAAACATTAACAAGAATCTTGTGGGCTCTGCCATGGCTG 
GTAGCATAGGAGGCTACAACGCCCATGCTGCCAACA1CGTCACTGCCATCTACATTGCATGTGGC 
CAGGATGCAGCACAGAATGTGGGGAGTTCAAACTGTATTACGTTAATGGAAGCAAGTGGTCCCAC 
AAATGAAGACTTATACATCAGCTGTACCATGCCGTCTATAGAGATCGGAACCGTGGGTGGTGGGA 
CCAACCTTCTACCTCAGCAAGCCTGCCTGCAGATGCTAGGTGTTCAAGGGGCGTGCAAAGACAAT 
CCTGGAGAAAATGCACGGCAGCTTGCCCGAATTGTGTGTGGCACTGTGATGGCTGGTGA 
CTTGATGGCAGCATTGGCAGCAGGACATCTTGTCAGAAGTCACATGGTTCACAAAGCTTGGCCAA 
GGGCGAATTCCAGCACACTGGCGGCCGNTCTAGTGGATCCGAGCTCGGACCAACTTGATGCATAC 
TT
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GAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCGCC 

CTTCGCGGGATCCTGCTGAAGATTTGGAAAAGGTGTTTATTCCTCATGGACTGATTATGGACAGG 

ACTGAAAGACTTGCTCGAGATGTCATGAAGGAGATGGGAGGCCATCACATTGTGGCCCTCTGTGT 

GCTGAAGGGGGGCTATAAGTTCTTTGCTGACCTGCTGGATTACATTAAAGCGCTGAATAGAAATA 

GTGATAGGTCCATTCCTATGACTGTAGATTTTATCAGACTGAAGAGCTACTGTAATGACCAGTCA 

ACGGGGGACATAAAAGTTATTGGTGGAGATGATCTCTCAACTTTAACTGGAAAGAACGTCTTGAT 

CGTTGAAGATATAATTGACACTGGTAAAACAATGCAGATTTTGCTTTC 

GCCCCAAAATGGTTAAGGTTGCAAGCTTGCTGGTGAAAAGGACCTCTCGAAA 

GCCAGACTTTGTTGGATTTGAAATTCCAGACAAGTTTGTTGTTGG 

AAGGGCGAATTCCAGACACTGGCGGNCGTACTAGTGGATCCGAGCTCGGACCAAGCTTGATGCAT 
ACTT 
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GCGAAWGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCG 
CCCTTATGCCAGATCACAGCACATTCACAGCTCCCCAGCATTTCACCAATGCATTGCTGTAGTGT 
CGTTTAAAATGCACCTTTTTATTTATTTATTTTTG^ 
TTTAATGAAATGCCAATATAATTTTTTAAGAAGGCAGT 
GAAAATTTTTTACTCATTTTTTTCATGTTTTACATGAAAATAATG 

AGTCACAATTGCACAATATATTTTCTTAAAAATACCAGCAGTTACTCATGCATATATTCTGCATT 
TATGAMCTAGTTTTTAAGAAGAAACTTTTTTTGG 

gctgttgatcttataatgattcttaaactgtatgg^ 

ATATAGAGAGATATGCTTATATCTGGAAGGTATATGGCATTTATTTGGATAAAATTCTCAATTG^ 
GAAGTTATCTGGTGTTTCTTTACTTTACCGG 
CCAGCACACTGCGGG 



'ggatttcgcccttatcgngggatccaaagcgto 

GCCCCCCCTTGCTGGGACGAACAGGCAGGTGAACGTTCTGCTCTACGACATGAACGGCTGCTACT 
CACGCCTCAAGGAGCTGGTGCCTACCCTGCCTCAGAACCGCAAAGTGAGCAAGGTGGAGATACTG 
CAGCATGTTATCGACTACATCAGGGACCTGCAGCTGGAGCTGAACTCTGAGTCTGAAGTCGCGAC 
CGCCGGAGGCCGGGGGCTGCCCGTCCGGGCCCCGCTCAGCACCCTGAACGGCGAGATCAGTGCCT 
TGGCGGCCGAGGCGGCATGTGTTCCAGCCGACGACCGCATCTTGTGTCGCTGAGGCGGCGCACTG 
AGGAACCAGATGGACTCCAGCCCTTCAGGAGGCAAGAGGAAAAAAAGTGCTCTCGGTTCCCCAGA 
GCAACCCGGGGAAAGACACTACCGCGGCCACGGGACTCTTGACGGATCTGTCCAGGGGGTAGAGG 
GTTGATCAACGGAGTCTCGCCCTCTCCAAGCTTGGCCAAAGGGCGAATTCTGCAGATATCCATCA 
CACTGGCGGCCGCTCGAGCATGCATCTAGAGGGCCNGTTTTCCAA 






TATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCXCTCGAGGCCAAGAATTC 
GGCACGAGGGGAGCACTAACCAGGAAAATGGCAGACGGCTTCTCGCTTAATGATGCCTTAGCTGG 
CTCCGGAAACCCAAACCCTCAAGGATGGCCTGGTGCATGGGGGAACCAGCCTGGGGCAGGAGGCT 
ACCCAGGGGCCTCCTATCCTGGGGCCTACCCAGGACAGGCTCCTCCAGGGGGTTATCCTGGACAG 
GCTCCTCCTAGTGCCTATCCGGGCCCAACTGGCCCTAGTGCTTATCCTGGCCCAACTGCCCCTGG 
AGCTTATCCTGGCCCAACTGCCCCCGGAGCCTTCCCAGGGCAACCTGGGGGACCTGGAGCCTACC 
CCAGTGCTCCTGGGGCCTATCCTGCTACTGGCCCCTATGGTGCCCCGACTGGACCACTGACAGTG 
CCCTACGATATGCCCTTGCCTGGAGGAGTCATGCCTCGCATGCTGATCACAATCATGGGCACAGT 
GAAGCCCAACGCAAACAGTATCACTCTGAATTTCAAGAGAGGGAACGACATCGCCTTCCACTTTA 
ACCCCCGCTTCAATGAGAACAACAGAAGAGTCATCCGTGTGCAACACGAAGCAGGACAATTAACT 
GGGGAANGGAAAAAAAGACAG 
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NCNNNCGACTCCCANNTCTTATATGACATGATTACGAATTTAATACGACTCACTATAGGGAA 
GGCCCTCGAGGCCAAGAATTCGGCACGAGGGAACTGACTATAGCTCATTTCTTCTCTCTCCCATT 
TTGTGGGAGTGAGTTTCAAGTGATAATTACTAGAAACCTTTCAGTTTTACCTTTT^ 
TATTGACTTGTTACCTGGGTGTGATTCAGGAACTCTCAGGCTCATCTGGTGAACACTATTTTG 
TCTTAAGAGGCAGTTTGAGATGGTATCAACTTATATACAAGGAATTCNGAAACTCGAGCTCTGGG 
CACACCAGCTCAGGAAAGTCTTTGCTCTACCGCTGTTTTAAACATTTCAGAAGCCAGCATCCTGC 
CCCTCCGACCACTANGWTTTGTCTGAATAAAACAGGAAGTGATTCTTATCCCTGGTTCTCAGGGA 
AGGGATTAGCATTCAGTTCTTCTTGTTTACATTTTTACTAACTGCTC 

GTGCTCTTTGTTGACATGATCAACTCNTATTGTGATGNAAGCAACCTGNGGGCANGCGTTCAGGT 

CTGCCCTGTCTTTTAAATGATACTGAGATTTNCCNCTTTGGGTNGAAAGAGGCCATC 

GCCCNAAGCCNAATGTGGAACCACNCCCGANAATTTAAAGAAAATAAAl^NCAAAATGNAGGCNGG 

TTN 






TTt^CGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAAT 
TCGCCCTTCGCGGGATCCGGAGCCTCGACCTCTGCATGCCCTCACCCGTGGCCAGGGAGCCTGTG 
TACTAGAACCTGCCGCACCCGCCACGAGCAGCTTGTCCGGTTCTCAGCATGAAGAGGCAAAGGCT 
GCTGTGGCCTCTGAGGATGAGCTTGCCGAGAGCCCAGAGATGACAGAGGAACAGCTGCTGGATAG 
CTTCCACCTCATGGCCCCATCCCGTGAGGACC^GCCCATCCTGTGGAATGCCATTAGCACCTACA 
GCAGCATGCGGGCCCGGGAGATCACTGACCTCAAGAAATGGAAGGAGCCCTGCCAACGGGAACTC 
TATAAAGTGTTAGAGAGATTAGCTGCCGCTCAACAGAAAGCAGGAGATGAGATCTACAAATTTTA 
TCTGCCAAACTGCAACAAGAATGGATTTTATCACAGCAAACAGTGCGAGACATCTCTGGATGGAG 
AAGCTGGGCTCTGCTGGTGTGTCTACCCATGGAGTGGGAAGAAGATCCTTGGATCAAGCTTGGCC 
AAGGGCGAATTCCAGCACACTGGCGGCCGGTACTAGTGGATCCGAGCTCGGTACCAACTTGATGC 
ATAGCTTGAGTATTCTATAGTGNCACCTAAATAGCTTGG 
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I^^^A^C^^ACCTAAA ^^.rTTr^GTAATCN 1 

Intcatgatggagnccccttnccacagatng 



Itcaacaagatagaagtcaagaccaaag 
agcacctctcaagcagagcacagacci 

ICTTCACCATGGAACCCGTGTCTTCCT^ 



lACCTGTCTTCCTAGGAAACAGCAATGGTCGGGACATAGTTGA 



iTCAACAAGATAGAAGTCAAGACCAAAGTG^AGTTro | 

TAAAGATG 
rGAGCAGT 

:gtggcac 
:tatttat 

| A TCAAAAGCTTGGCCCAAAN^ 

IcTCTCCCCTCTTCAGCTTCGTCCTGOBOTTCT^^ I 

V,tgcacctaaatagcttggcgtaatcatggcatactggttct 1 
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GCGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCG 
CCCTTATCGCGGGATCCTGGCTGCAATACCAGAAGAAGGCTCTTGTGTCAACTTCAAAGAAATGG 
TGTTTATTGACAACACACTTTACCTTATACCTGAAGATAATGGAGACTTGGAATCAGACCACTTT 
GGCAGACTTCACTGTACAACCGCAGTAATGCGGAGCATAAATGACCAAG1TCTCTTCGTTGACAA 
AAGAAACCCGCCTGTGTTCGAGGACATGCCTGATATCGACCGAACAGCCAACGAATCCCAGACCA 
GACTGATAATATATATGTACAAAGATAGTGAAGTAAGAGGACTGGCTGTGACCCTATCTGTGAAG 
GATGGAAGGATGTCTACCCTCTCCTGTAAAAACAAAATCATTTCCTTTGAGGAAATGAATCCACC 
TGAAAATATTGATGATATAAAAAGTGATCTCATATTCTTTCAGAAACGTGTGCCAGGACACAACA 
AAATGGAATTTGAATCTTCCCTGTATGAAGGACACTTTCTAGCTTCCCAAAAGGAAGATGATGCT 
AAGCTTGGCCAAGGGCGAATTCCAGCACACTGGCGGCCGTTACTAGTGGATCCGAGCTCGGACCA 
AGCTTGATGCATAGCTTGAGTATTCTATAGTGCACCTAAATAGCTTGG 
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TATGAATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTCG 
GCACGAGGCTGGTATAAAAGGGAATCACCATGCCCTCTACAGGGATGACTTCAGGAAAATGGTCA 
CTACTGAGTGCCCTCAGTTTGTGCAGAATAAAAATACCGAAAGCTTGTTCAAAGAATTGGACGTC 
AATAGTGACAACGCAATTAACTTCGAAGAGTTCCTTGTGTTGGTGATAAGGGTGGGCGTGGCAGC
TCATAAAGACAGCCACAAGGAGTAACAGAGCTTCTGGCCTGGGGCTGGGCCCTTGGATATGTCTA 
CAGAATAAAGTCGTCATATCTTANGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACATTGCGGCC 
GCAAGCTTATTCCCTTTAGTGAGGGTTAATTTTAGCTTGGCACTGGCCGTCGTTTTACAAC 
TGACTGGGAAAACCCTGGCGTTACCCAACTTAATCGCCTTGCAGCACATCCCCCTTTCGCCAGCT 
GGCGTAATAGCNAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAA 
TGGGACNCNCCCTGTAGCGGCGCATTAACGCGGCGGGGTGTGGTGGTTACGCCAGCGTGACCGCT 
ACACTTGCCNGCGCCCTAAGCGCCCGCTCCTTTCCTTGGGGGGGGGNGGGTTG 
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TATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC 
GGCACGAGGGGGAAGACCTTCCAGGCCGTGATGAGGTTCGACACCGATGTGGAGCTCACTTACTT 
CCACAATGGAGGCATCCTGAACTACATGATCCGAAAGATGGCCCAGTAGGTGCTGGCCTCTCAGG 
AGACCCGCGCTTGGTGCTAGACCCAATGAGGTACCAGGCCTCCGCTGGTGGAGGCCTGGCGAGCA 
GCCACCTCTACTTCTCGTGAGGGTGCTAGCAAGATGAGCAAGTGGGCCCTGCCATTCCTGGAGGC 
TCAGCGGCAGGAGTCTCTAGTTCGGTGATTTGTTAATCTTTTTATCCTTTO 
CTAGAATCATGGGAAGGTCCATAGTCCCAAAGAGAGCTACCTTCTCTTTAAAGTCACTCATCACC 
GGTCATTGATTTTTTTCACTCTGACTAATCTTCAGCAGAACTAGCCAGTATCTCAGAAGTGTCTC 
CTACCCTTTCTGTTACTCTGTCTGTCTGTGCTCAGTGACACCCTTCCCTGGAGAGCCCATTCCTC 
CGTGTATCACACCAATGGTAACGACATAGCTTCAGACTCTGTCACACTTCAATTCATAGTAATCG 
NGTGATCCCTTNCTTCCAAGTGAGCGAAAAACCTTGTGGCTAAGGGCG 






TGCGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTC 

GCCCTTATCGCGGGATCGATAGGCTTCTTCCCCTGGGAATACTGATGGATTTTTTT^ 

TTTTTTTTGTACGACGTCAGGTGTTCAAACACTTCCTTGATAGCATCACTTTAAGA 

AGGACTGACTGAGGCAGATTGAGAATTCGTCTAGAACAGGTTTTTC 

TTTATTTTTTCCTGCTTTAGACTTGAAAAGAGACAGGC 

GGAACAAACTGAGCTATGTAGTCAGAATGTGACTGGTTGGATCTCATTAAAAGTATCAGATTGTG 

TGAAGTTGGAAGCTTACCAATCTTACTTTGTAAATTCTGATTTCTTTTC 

GCTGAACCACTTGTAGATTTGATTTCATTGTTGGTGTCTACT^ 

GCTAGATGAATACTTGAACCATAAAATGTCCAGTTAGAGCACTGTTTAGATTGGCCATAGAGTAC 
ACTGCCTGCTCGAGGGCCAAAAGGGCGAATTCCAGCACACTGGCGGGCGGTACTAGTGGATCCGA 
GCTCGGTACCAAGCTTGATGCTAGCTTGAGTATTCTAT 






GCGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCG 
CCCTTATCGCGGGATCCTGCTTCCACCTCGTCTGTCTTGTGGGCACCATATCTTTA 
GACATGAGTCCAGAGCAGACGGCCACGAGCGTGAACTGTTCTAGCCCCGAGCGACACACGAGAAG 
TTATGACTACATGGAAGGAGGGGATATAAGGGTGAGGAGACTGTTCTGTCGCACCCAGTGGTACC 
TGAGGATTGACAAACGAGGCAAAGTGAAAGGGACCCAGGAGATGAGGAACAGCTACAACATCATG 
GAAATCAGGACTGTGGCAGTTGGAATTGTGGCAATCAAAGGGGTGGAAA 

CATGAACAAAGAAGGGAAACTCTATGCAAAGAAAGAATGCAATGAGGATTGCAACTTCAAAGAAC 
TGATTCTGGAAAACCATTACAACACCTATGCATCAGCTAAATGGACACACAGCGGAGGGGAAATG 
TTCGTGGCCTTAAATCAAAAGGGGCTTCCTGTCAAAGGAAGCTO 

CACTGGNGGGCGTTCTATNGGATCCNGCTCTGNCCCCNCTNGATGCATATCTNGAGNATTCTATA 
TGGN
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Liver fatty acid 
binding protein 


V01235


TNGNNATTGGGCCCTCTAGATCCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTC 

GCCCTTAGTACCAAGTGCAGAGCCAAGAGAACTTTGAGCCCTTCATGAAGGCGATGGGTCTGCCT 

GAGGACCTCATCCAGAAAGGGAAGGACATCAAGGGGGTGTCAGAAATCGTGCATGAAGGGAAGAA 

AGTCAAACTCACCATCACCTATGGGTCCAAGGTGATCCACAATGAGTTCACCTTGGGGGAGGAGT 

GCGAACTGGAGACCATGACTGGGGAAAAGGTCAAGGCAGTGGTTAAGATGGAGGGTGACAATAAA 

ATGGTGACAACTTTCAAAGGCATAAAGTCCGTGACTGAATTCAATGGAGACACAATCACCAATAC 

CATGACACTGGGTGAAGGGCGAATTCCAGCACACTGGCGGCCGTTACTAGTGGATCCGAGCTCGG 

TACCAAACTTGATGCATAGCTTGAGTATTCTATAGTGTCACCTAAATAGCTTGGCGTAATCATGG 

TCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAATTCCACACAACATACGAGCCGGAAG 

CATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTACTCCATTAATTGCGTTGCGCT 

CCCTTTTCAGTCNGGAAACCTG 


Low density 
lipoprotein 
receptor 


X13722


GCGAATTGGGCCC^TAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCG 
CCCTTATCGCGGAATTCGACATGGCTGGCAGAGGGGACGAGGTGCAGCG^CACGGTGTGGGGTTC 
TTGTCCATCTTCCTCCCCATTGCACTGGTGGCCCCCCTTG 

GAACTGGCGGCTGAGGAACATTAACAGCATAAACTTTGACAACCCAGTCTACCAGAAGACCACGG 
AGGACGAGATCCACATTTGCCGCAGCCAGGATGGCTATACCTACCCCTCGAGACAGATGGTCAGC 
CTGGAGGATGATGTGGCATGAACAGCTGAGGGGAGCCATCTCTTTCCGGGATCCGCTGCCACCCT 
TAGGCAGGAAGGACGCTTTCTCACACCTCCCCGCCCTGCACTGGTCCTTCCACCTCAGTGGTCTC 
TGTGTTGCTCAAAGCAAGATAAGAGCAAAACTGGGCTGGGGCCAAGCTCAGCGGCCTGTCTGCCC 
TGGGTCCTGTTTTATATATTTATTGTCTGGGGACAGAAAAGG 

AAAGCTTGGCCAAAAGGGCGAATTCCAGCACACTGGCGGCCGTTACTAATGGATCCGAGCTCGGT 
ACCAAGCTTGATGCATAGCTTGAGTATTCTATAGTGNCACCTAAAT 


Macrophage
inflammatory 
protein-1 alpha 




GCGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCG 

CCCTTATCGCGGGATCCTGGCGCTCTGGAACGAAGTCTTCTCAGCGCCATATGGAGCTGACACCC 

CGACTGCCTGCTGCTTCTCCTATGGACGGCAAATTCCACGAAAATTCATTGCTGACTATT^ 

ACCAGCAGCCTCTGCTCCCAGCCGGGTGTCATTTTCCTGACCAAGAGAAACCGGCAGATCTGCGC 

TGACCCCAAAGAGACCTGGGTCCAAGAATACATCACTGAGCTGGAACTAAATGCCTGAGATTAGA 

GGCAGCAAGGAACCCCCAAACCTCCGTGGGCCCCGTGTAGAGCAGGGGCTTGAGCCCCAGAACAT 

TCCTGCCACCTGCAAATCTCCCCCTCCTATAAGCTGTTTGCTGCCAAGTAGCCACATCCAGGGAC 

TCTTCACTTGAATTTTTATTTAATTTAATCCTATTGA 

TCCCCCAAGCTTGGCCAACAANGGCGAATTCCAGCACACTGC^ 
ACTCGG 


Macrophage
inflammatory 
protein-2 alpha 


U45965 


CCGCCAGTGTGTGGAATTCGCCTTTGNCGGATCCGCCAGCTCCTCAATGCTGTACTGGTCCTTGC 

TCCTCCTGCTTGCCCACCAACCATCAGGGACAGGTGAGACTCGAGGCTGACATTCTTGGAGGAGC 

CTCAGGTGGGCGCAGCCATGCCCAGGCCCTCTGACCCACTCTCTTCTCCTACAGGGGTTGTTC 

GCCAGTGAGCTGCGCTGTCAATGCCTTGACGACCCTACCAAGGGTTGACTTTCAAGAACATCCAG 

AGCTTGACGGTGACCCCTCCAGGACCCCACTGCGCCC^GACAGAAGTCATAGCCACTCTCAAGGA 

TGGTCATGAAGTTTGTCTCAACCCTGAAGCCCCCTTGGTTCAGAGGATCGTCCAAAAGATACTGA 

ACAAAGGCAAGGCTAACTGACCTGGAAAGGAAGAACATGGGCTCCTGTACCTCAACGGGCAGAAT 

CAAAGAGAAAAGAAACAAACTGCACCCAGGAAGCCTGGATCGTACCTGATGTGCCTCGCTGTCTG 

AGTTTATCTATTTATTTATATATGTATTTATTTATTC 

TACTATGATATTTAAAGATATGCATTGGCCAGCTCACTGTAAAGCTTGGCCAAGGGCGAATTCTG 
CAGATATCCATCACACTGGCGGCCGCTCGAGCAGCATTNAGNGATCTAANGGCCATNCAA 


Macrophage
metalloelastase 


X98517


CCAGCTTGGGACCGAGCTGGGATCCACTAGAACCGGCCGCCAGTGTGCTGGAATTCGCCC 

GGAGTCCAGCCACCAAACATTACTTCAATTTCTTCCATGTGGCCAACTATCCCATCTGGTATTCA 

AGCTGCTTATGAAATTGGAGGCAGAAATCAACTTTTTCTTTTTAAAGATC 

TAAACAACTTGGTACCAGAGCCACACTATCCCAGAAGCATACATTCTCTGGGCTTCCCTGCATCT 

GTAAAGAAGATTGATGCAGCTGTCTTTGATCCACTTCGCCAAAAGGTCTATTTCTTTG 

ACAATATTGGAGGTACGATGTGAGGCAGGAACTCATGGACGCTGCTTACCCCAAGCTGATTTC 

CACACTTCCCAGGAATCAGGCCAAAAATTGATGCAGTTCTCTATTTCAAAAGGCACTACTACATC 

TTCCAAGGAGCCTACCAATTGGAATATGACCCCTTACTGCATCGTGTCACCAAAACATTGAGCAG 

TACGAGCTGGTTCGGTTGTTAGGAAGAAAAGGGCGAATTCTGCAGATATCCATCACACTGGCGGC 

CGCTCGAGCATGCATCTAGAGGGCCAATTCGC 
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GTTGGTACCGAGCTCGGATCCACTAGTACGGNCGCCAGTGTGCTGNAATTCGCCCTATCGCGGGA 
TCCACACAATTGCTGGCTTCTCACAGAGCTGTGACCTTTATCCAGGAGATGGTTTGTTTGAACTA 
CTTCCCAAGAATTGCCGTGGCTGCCCCAGGGAGATACCTGTAGACAGCCCGGAGTTGAAGGAGGC 
ACTTGGTCATTCCATTGCACAGCTTAATGCACAGCATAACCATATTTTCT^ 
CCGTGAAAAAGGCAACATCACAGGTGGTTGCTGGAGTAATATATGTGATTGAGTTCATAGCCAGA 
GAAACTAACTGTTCCAAGCAAAGTAAAACAGAACTGACAGCGGATTGTGAGACCAAACACCTCGG 
TCAAAGCCTCAACTGCAATGCTAACGTGTACATGAGACCTTGGGAGAACAAAGTCGTCCCGACTG 
TCAGATGCCAAGCACTAGATATGATGATTTCTAGGCCTCCAGGATTTTCACCTTTCCGGCTGGTG 
CGAGTACAAGAAACTAAAGAAGGAACAACTAGGCTCCTAAACTCATGTGAGTACAAGGGCAGACT 
CTCAAAGGCAGGCCTAGGAAGCTTGGCC^AAAGGGCGAATTCTGCAGATATCCATCACACTGGCG 
GCCGCTCGAGCATGCATCTAGAGGGCCAATTCGC 






GCGAATTGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCGC 
CCTTATCGCGGGATCCTGTCCCTGGAGGAGGAGGAGGAGGAAGAGGAGGAAGAAGGTTCTGGAAG 
TGAAGGTGCTCTTGGAAATGAAGGAGCTGTCTCAGGTCAGGATGTGACAGATGAGAACCTTCAGT 
GCCCCAAGGAGGAAGACACAACGAGTCTGATGGGTGACTCTGGATTCAAGACTGGTCGCTACCTC 
CTAGTCAGGAGGCCTGAGTGCTTTAACAAAGCTCAGTTGGTCTGCCGGAGCTGCTACCGGGGCAC 
CCTTGCCTCCATCCACAGTTTCAGTGTTAACTTCCGAATCCAGTCCTTTGTCAGGGGAATCAACC 
AGGGTCAAGTCTGGATTGGAGGCAGGATTGTGGGCTGGGGTCGCTGCAAACGCTTCCGATGGATT 
GATGGAAGCTCTTGGAATTTTGCATACTGGGCTGCTGGGCAGCCTCGTCGCGGCGGTGGCAGATG 
TGTGACCCTGTGTACCCGAGGAGGCCACTGGCGCCGATCTGGCTGTGGCAAAGAGACCAAACTTG 
GCCNAAAAGGGCGAAATTCCACCACACTGGCGGGCCGTTTNCTAGTNGGATCCCCAANCTTCGGT 

ACCCAAC 






TCTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAAT
TCGGCACGAGGCCTCGTGCCGCCTCGTGCCCCAAGGAGAACTTCAGTTGCCTGACTCGATTGGAC 
CACAACCGAGCAAAATCTCAAATTGCTCTTAAACTCGGTGTAACCGCTGATGATGTAAAAAATGT 
CATTATCTGGGGAAATCATTCATCAACCCAGTATCCAGATGTCAATCATGCCAAGGTGAAATTGC 
AAGGAAAAGAAGTTGGTGTGTATGAAGCCCTCAAAGACGACAGCTGGCTCAAGGGAGAGTTCATC 
ACGACTGTGCAGCAGCGTGGTGCTGCTGTCATCAAGGCTCGGAAGCTGTCCAGTGCCATGTCTGC 
TGCGAAGGCCATCTCGGACCACATCAGAGACATCTGGTTTGGAACCCCGGAGGGCGAGTTCGTGT 
CGATGGGCGTAATCTCTGATGGCAACTCCTATGGTGTCCCTGATGACCTGCTCTACTCGTTCCCT 
GTCGTGATCAAGAATAAGACCTGGAAGTTTGTTGAAGGCCTCCCCATTAACGACTTCTCCCGTGA 
GAAGATGGACCTGACAGCAAAGGACTGACCGAGGAAAAGGAAACGGCTTTTGAGTTTCTCTCCTC 
CGCATGACTACACAGTCGTGTTGACGTCAGCAAACAN 






TTGCGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATT 
CGCCCTTATCGCGGGATCCGGTACTGGCAGTGGGCCTTC^ 

ACACCCAAGCGAGTGCTTTACTGCCAGCGTTCACTGCTGGACAAGGTCTGACCCCCACCACTGGC 

CCACCCGCTTCTACCACAAGGACTTTGCCTCCTCCGAAGGCAGTGGCAGCCGGTGGTGGCAGGTG 

GGCTGTTCTCACCCATCCTGGGCTCCCTCCCTCCAGCCTCCCTTCTCAGTCCCTAATTGGCTTCT 

CCCACCCTCACCCCAGCCTTGCTTCATCCATAGGTGGGTCCCTTGAGGGCTGAGCAGAAGATGGT 

CTGGCCTCTGGCCCTCAAGGGACCCTCATAGCTTGGTGTGTGTCCAACCCTATTTGAATC 

AAGGCTCTGCACTTGAAGGCAGGACCCTCTGACCTTACAGGCAAAGGCCAAATGGGGTCATCTGC 

TTCTCTTCCATCCCCCTAACTACATATCTTAAATCTCTGAACTATGACCTCAGGAGGCTTTGGGA 

AGCTTGGCCAAAAGGGCGAATTCCAGCACACTGGCGGCCGTTACTAGTGGATCCGAGCTCGGTAC 

CAAC 



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S P 



CTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATT 
CGGCACGAGGATAAAAATGGCTGTCCTCTGAGGCATTCTAGAGCTCACTTAAAAGGGGTGTTCAA 
TTTGCTGTCCTTGGTGCAGGTGTCCTGTCAAAGATTGGCAGGAGCGGTGAGAACCCATACCCTCC 
CCTGAACCTCCTGGCCGACTTTGGTGGCGGTGGCCTCATGTGCACATTGGGCATTTTGCTGGCTC 
TCTTCGAACGCACGCGGTCTGGCCTAGGGCAGGTCATTGATGCTAACATGGTGGAAGGAACGGCA 
TACTTAAGTACTTTCCTGTGGAAAACTCAGGCCATGGGTCTGTGGGCACAGCCTCGAGGGCAAAA 
CCTGTTAGATGGCGGGGCACCTTTCTACACAACCTACAAGACCGCAGATGGGGAGTTCATGGCTG 
TAGGTGCAATAGAACCCCAGTTCTACACACTGCTGCTTAAAGGACTTGGACTTGAGTCTGAGGAA 
CTCCCCAGCCAGATGAGCATAGAAGATTGGCCAGAAATGAAGAAGAAATTTGCAGATGTGTTTGC 
AAGGAAGACTAAGGCAGAGTGGTGCCAGATCTTTGACGGGACAGA 
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TCTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAAT 
TCGGCACGAGGCATAAATGACATAGGTTTCCATGTTGGAGTCGGTGGATAGAGAAGGCTCCCATC 
TCTGGGTAAGCGGCTCAGGCAGCCCCTCATGCTCCACACGGCATGTGTAATTCTGCTCCTTCCCA 
AGTGGCACCACCACAGATGCCCACTTCTGGAAGGTTCCATCCCCTGCAGGCCTGGTCTCCACAAG 
CTCCATGTCCTGGGTCAGGTCCTCCCCATTCAACTGCCAGCTCAGGGAGATGTCAGCAGGGTAGA 
AGCCCAGGGCCCAGCACCTCAGGGTGACATCACCTTCAGGTCTGGGGTGAAGGGTCACATGTGCC 
TTTGGGGGATCTAAGCGCAGCAGCGTCTCCTTCCCGTGCTCCAGGTATCTGCGGAGCCACTCCAC 
GCACGTGCCCTCCAGGTAGGCCCTGAGTCTCTCTGCAACACCAGCCCGATCCCACTTGTTCCGGG 
TGATCTGTGCCGCAAAGTCCGCCGCCGTCCACGTCTTCAGGTCTTCGTTCANGGCGATGTAATCG 
CGGCCGTCGTAAGCGTCCTGCCTTATTACCCGCGGAGGAGGNTTCCCGTTCCGTTCCCACGTCAC 
AAGCCATACATTCCTCTGGATGGGGGTGAAANNCCGCCCCTCGCTCTTGG 






CCCAGTGTGCTGGAATTCGCCCTTCGCGGGATCCTTAAATTTCAGGCACTTTC 
ACCCATATATTTAAAGCTTTTTGTGCAGTAAGAAAGTGTAAAGCCAATTCCAGTGTTGGAC 
CAGGTCTCGGTATTTAGGTCAAGGTGTCTCCATTCTCTATCAGTGCAGAGACATGCAGTTCTGTG 

ggcagggtaggaccctgcatcatctggagcccagaagga<^cgactggk:caggcctcaccgcct 

CAGTATGCAGTCCAGCTTCACGTCATCCCCTCACAATGGTTAGTAGCAACGTCTGGGTTTGAACG 
CCAGGCGTGGTTATTTTATTGAGGATGCCTTTGCACATGTGGCCATGCTGTGTTAGGA 
CCAGGGCCCGGACTTGAAGCTAGAGCTGGCAGAAGAGCTCCTGGCATCCATGGTGCGATGCTGCC 
GCCACCCAGTTTCTCCATTGGAAGACAAGGGAATGAGAAGACTGCTGTGTAT
CTTGGTTGTGATCTGGCTGCAGGGCCAAGGGCGAATTCTGCAGATATCCATNACACTGNCGGCCG 

CTCGAGCATGCATNTAGAGNGCCCG 



< 

0 



g 



TGGTGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATT 
CGCCCTTCGCGGGATCCGAGAAGAACTGGTGTGAGGAGCAGTACTCCGGGGGCTGCTACACAGCC 
TACTTCCCTCCTGGTATCATGACCCAGTATGGAAGGGTGATTCGCCAGCCAGTAGGTAGGATTTA 
CTTTGCAGGCACAGAGACAGCAACACAGTGGAGTGGCTACATGGAAGGAGCAGTTGAAGCTGGAG 
AACGAGCAGCTAGAGAGGTGTTGAATGCTCTAGGAAAAGTCGCGAAGAAGGATATATGGGTTGAA 
GAACCCGAGTCCAAGGATGTTCCAGCCATTGAAATTACCCACACCTTCTTAGAGAGGAACCTGCC 
TTCCGTGCCTGGTCTGCTCAAGATCACTGGTGTTTCCACTTCTGTGGCTCTTC 
TGTACAAGATTAAGAAGCTCCCATGCTGAAGTTTCACCCTCAGGCCTCCTGCACGATCATCGCAT 
GTGAAAGAAAGTGTGGATAAATTACAGCCTATGGTTTGGGCCATTTAAAGCTTGGCCAAGGGCGA 
^TTCCAGCACACTGGCGGGCGTTACTAGTGGATCCAACTCGGACCAAGCTTGATGCATACTTGAG 

TATTCTAT 






GTCATCAAGCGAGCTOTAGCATTTAGGTGNCACTATAGAATAGGGCCNTNTAGATC^ 
GCGCCCGCGATATCGAATTCGCCTTTCGCGGGATCCATTTGGCAGCCAGAACCAGAATCTGTGGA 
TGTCCCAGCAAGACCCATTACCAACACCTTCCTGAAGAGACACTTTGCCTTCTGTACCAGGTCTA 
CTAAAGCTGCTTGGATTGACCACCATCTTGTCAGCAACAGCTCTTGGTTTCCTGGCCCACAAAAA 
GGGTCTGTTTGTACGTTTCTAAAGATGGGCTTTAGGACCATATCCACAGGTTTCTCATTCAGTGT
GTCACAAAAGCTTTTGGAAGGAGTTGGGATAAAAATCTGACAAAGGTGCAGAGATTATGGAGTC 

GAAAGCACAGTAACTTGGTCTCCATTTTGGCT^ 

AACTTTCCTGCACTCTGAATATTGAGAACAGATACACAGGCTCTCTCACAACCTACCTGCCCTAT 
GCACATAGTTGTTTTTCAAAACCCTATGCCTTTGTG^ 

CACCCTGCAGGGCCAAGGGCGAATTCCAGCACACTGGCGGCCGTTACTAGTGGATCCGAGCTCGG 
TACCAAGCTTGGNCTCAA 



GCGAATTGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCGC
CCTTATCGCGGGATCCTGATCAGCATACTTGTGGCC^ 

CAAACAATAATGAGGAATATCTTGAGTTTGATCCTGCCCCTACTTGTCATGGTCATCTGC 
AGGAATCCTCCACACCCTGTTTCGCTGTAGGAATGAGAAAAAGAGGCATAGGGCTGTGAGGCTCA 
TCTTTGCCATCATGATTGTCTACTTTCTCTTCTGGACTCCATACAATATTGTTCTCTT 
ACCTTCCAGGAATTCTTGGGAATGAGTAACTGTGTGGTTGACATGCACTTAGACCAGGCCATGCA 
GGTGACAGAGGCTCTTGGAATGACACACTGCTGCGTTAATCCTATCATTTACGCCTTTGTTOT 
AGAAGTCCCGAAGGTATCTCTCCATATTTTTCAGAAAGCACATTGCCAAAAATCTCTGCAAAC^ 
TGCCCAGTTTTCTATAGGGAGACAGCAGACCCGAGTGAGAAGCTTGGCCAAGGGCGAATTCCAGC 
ACACTGGCNGGCCGNTACTAGTGGATCCGAGCTCGGNNCCCANCTTTGAN 
TATGACATGATTACGAATTNANTACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC
^ScGAGGACAGGGACAGATGGGCTGTGCGCGCTGCGCGAGCTGAGCGTGGACCTTCGAGCAGA 
GCGCTCAGTGCTCATCCCGGAGACCTACCAAGCCAACAACTGCCAAGGCGCCTGTGCATGGCCAC 
AGTCGGACCGTAACCCACGGTACGGGAACCACGTGGTGCTGCTGCTAAAAATGCAGGCACGCGGG 
GCC^CCTGGGTCGCCTGCCCTGCTGTGTGCCCACTGCCTACACCGGCAAGCTGCTCATCAGCCT 
GTCGGAGGAGCACATCAGCGCGCACCACGTGCCCAACATGGTGGCTACCGAATGCGGCTGCCGGT 
GATGTCCGCCCTACCCCATCCCCCGTGTCCCCAGTCAGCGCCCCAATAAAGATTAGCAAGCAAAA 
AAAAAAAAAAAAAAAAAACATT^GGCCGCAAGCTTATTCCCTTTAGTGAG^ 
TGGCACTGGCCGTCGTTTTTACAACGTCGTGACTGGGAAAACCTG 

TTGCAGCACATCCCCCTTTCGCCAGCTGGCGTAATANCGAAAAAGCCCCNCCCGATCGCCCTTCC 
C^C AGTTGCNC AC CCTGAATGGC AATGGGACCCCCCTGTAGC GGCGCANTT AAC CCGGCGGGGG 

T GGNG 

GAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCGCC 



CTTCGCGGGATCCTGGAACAGTGTTTCTAGATGGCAAAGAAATAAAGCAACTCAATGTCCAGTCG
CTCCGCGCCCACCTGGGCATTGTGTCCCAGGAGCCCATCCTGTTTGACTGCAGCATCGCCGAGAA 
CATTGCCTACGGAGACAACAGCCGTGTCGTGTCTCATGAGGAGATCGTGAAGGCAGCCAAGGAGG 
CCAACATCCACCAGTTCATCGACTCACTGCCTGAAAAATACAACACCAGAGTGGGAGACA^GGG 
ACTCAGCTGTCGGGCGGGCAGAAGCAGCGCATCGCCATCGCGCGCGCCCTCGTCAGACAGCCTCA 
CATCrTACTTCTGGATGAAGCGACATCAGCTCTGGATACGGAGAGTGA^GGTCGTCCAGGAAG 
CGCI^ACAAAGCCAGGGAAGGCCGCACCTGCATTGTGATCGCGCACCGCCTGTCCACCATCCAG 
AACGCAGACTTGATCGTGGTGATTCAGAACGGCCAGGTCAAGAACACGGCACAAGC1TCGCCAAG 
GGCGAATTCCAGCACACTGGCGGCCGTTACTAGTGGATCCGAGCTCGGTACCAAGCTTGATGCAT 
ACTTGAGTATTCTATAGTGCACCTAAATACTTG

ATCCCATGGCCGGACCAGTGTTCCTCGATGGTCAGNAAGCAAAGAAACTCAATG 
AGCTCACTTTGGCATTGTGTCCCAGAAGCCCATCCTGTTTGACTGCAGCATCGCCGAGAACATCG
CCTACGAGACANCAGCCGTGTCGTGTCTCAGGATGAGATTGTGAGGCGGCCAAGGAGGCCAACAT 
CCACCCCTTCATTGAGACACTGCCCCAAAAGTATGAAACAAGAGTAGGAGACAAGGGGACACAGC 
TCTCTGGAGGCCAGAAACAGAGGATTGCTATCGCCCGAGCCCTCATCAGACAGCCTCGTCTCCTA 
CTGCTGGATGAAGCCACGTCGGCTTTGGACACTGAGAGTGAAAAGGTCGTCCAGGAAGCGCTGGA 

CAAAGCCAGGGAAGGCCGCACCTGCATTGTGATCG^^ 

ACTTGATCGTGGTGATCGACAACGGCAAGGTCAAGGAGAAGCTTGGCCAAGGGCGAATTCTGCAG
ATATCCATCACACTGGCGGCCGCTCGAGCATGCATCTAGANGCCCGGNTACNAGNAAGNNCAN
GGGGCGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAAT 



TCGCCCTTCGCGGGATCAGATCTTCATTGACGGGCTCAATGTGGCACACATTGGCCTCCATGACC 
TGCGTTCACAACTCACCATCATCCCTCAGGACCCCATCCTGTTCTCGGGCACGCTGCGCATGAAC 
CTCGATCCCTTTGGCCGTTACTCGGACGCGGACATCTGGAGGACCCTGGAGCTATCCCACCTGAG 
TGCATTTGTGAGCAGCCAGCCGACAGGCCTGGATTTTCAGTGCTCTGAGGGTGG^ 
GTGTTGGCCAGAGGCAGCTCGTGTGCCTAGCCCGAGCCCTGCTCCGAAAGAGCCGTGTTCTTCTO 
TTAGACGAGGCCACCGCTGCCATTGACCTGGAGACTGATGACCTCATCCAGGGTACCATCCGTAC 
CCAGTTTGAAGACTGCACTGTACTGACCATCGCCCACCGGCTCAACACAATCATGGACTACAACC 
GGGTCCTGGTCTTGGACAAAGGAGTAGTAGCTGAATTTGATTCTCCAGTAAACCTCATTC 
lAAGCTTGGCCAAGGGCGAATTCCAGCACACTGGCGGCCGTTACTAGTGGATCCGACTCGGTACC 

AAGCTTGATGCATAN
TATGACATCATTACGAA^ 






GGCACGAGGCAGCATCTGAATGCCTACCGCCAGGAGGCTCACAACCGCATCTCCAGCCACATTC 
ATTGATCATCCAGTATTTCATCTTGAAGATGTTTGCTGAGAAGCTGCAGAAGGG 
TCCTGCAGGACAAGGATTCCTGCAGCTGGCTCCTGAAGGAAAAGAGTGACACCAGTGAGAAGAGG
AGATTCCTGAAGGAGCGGTTCGCAAGGCTGGCCCAAGCTCAGCGCAAGCTAGCCAAATTCC^ 
TTAAGCTGGCCCTGTCCTTTCCTGTGTCTCCTGGATAATGATTCAGGGACAGAAGGGCTCCTGCC 
TTCCCTTCAGCTAACCACTACCCTTTATCCTATTATAAATATTAGTTCTAAGATGTGAAGGAGCT
TTCTGrrCACTCTGAGATGATAAAGAGAAAGAGATTCTCAAAACTCAGCAATTAGATGAGTAGGA 
GAAGCCACTTTGCTGATAAGACAATAGCTTCAGTCTGAGTACCATTCCTATTCACCATATCCTCA 
TTITAGAACCCTGCCAGGAACAAATATTTGCTGAACAAATGGGCCATCATGT^ 
ATAGCATTTCTTAATAAACTACNTTTCTATGGAAANNGGGGGGGGGGGNTT 
Itggaccatccccggcatggcttcctcccaaggcacagagacacgggcatccttgactccatcgggi
c^tc^agcg^acaggggtgcgcccaagcg^^ 

tctocaagaa^ttgtgacacctcgtacaccccctccatcccaaggaaaggggagaggccto 

cScag^agctggggaggaagagacagc^ 
gcctccctcctcagccttcccgaatcctgccctcggcttctt^ 

Icaaaagggcgaattccagcacactgg^ 

Ia^^^^a gta^ctatagtgtcacctaaatagcttggcgtaa tcatgggcatagctggt 
cnaangggnattaaatttggnaacccaaggtttttccnacccccccnot I 

GGCCNAAATNAAANTNANCCCCCNAAAAANGNAAAANCNTTCNN^ 

£cggcagagac^0£ 

.cctcg^agggacccctctcactattgcacgatcct^ 

iArrr^G^CTG^CCAGCCATGGCGGTAGGAGTACATGCCTGTGAGCAAAGGCCGCTAGGGGTC I 

ITANCCNNNC 

Ianngggccctctagatgcatgctcgagcggccgccagtgtgatggatatctgcagaattcgccct 

ItCGGTTTCCTCCCCTTOCAC^^ 

Ir^AGGACAGCTACGGGCAGCAGTGGACCTACGAGCAGAGGAAGATT^ 

Icac^gccotctttctcagtatcgtggtagtgcagtgggctc 

IggAATTCTCTCTTCCM 

agctcatcatcaggcgacgccc^ 

IcTCCACGCCGTCGAACA^^ 

Icggccg^ac^tcgatccgagctcggtaccaagc 

jrACCTAAATAGCTGGCGTAATCATGGGCATAGCTGTTNCTGTGTGAA _ 

Itctcntatgacatgattacc^tttaatacgactcactatagggaatttggccctcgaggcc/ag 

AATTCGGCACGAGGGGAAGGTTTTTTT 

GT^GTOTCTC^ACAATTAACTGTCACTTTTTCTTTGCTCTAATGTAAATGAC 

Kctccaagggtcaacaatcactacactatgtttc 

CCCCTACTGTCCATCAGTGTGAACACAGGGAGCTTTTGTGTGCAAATC 

gagactcgaacaaItagaatgtgtgtag^ 
tctctccaaWc^^ 
ctcgagctttctgctctttgatttc^ 
Ica^xgcmttcacccatgtg^^ 

CTTCTAAACTGGTAATTT 



: NG NGNGl5jTGGCGGNANGN^
*Sn^GAGAGAClTOC^^ 

ICNCCGGAAAACNNNGGAGCGCNNNGGTANGACNGCNGTCGNGGCAGG 
CTCCANAATT^NCCNNAGTCNCGGGANCANNGNNG^ 

agaaagaagaaa^aaganatnccga^^ 

jCGTCAAAGNAGAAGCAGCTTNGTGGAAAAGAATNAAN^ 
iCGCTACGGGATCNGGGGCAAGNCCGCAG^ 

Irrrrc agatcgacaagtcccci^tagtcttctgcatggccacatacggagagggcgaaccccaci 

lACAATGCGCAGGACTTCTA^^ 

CAGCGGCTCGAGNAGCTTGGCGCCCAGCGCATCT^ 
CTTGGAAGAGGATTTCATCACGTGGAGGGAGCAGTTCTGGCCAGCTGTGTGCGAGTTCTTTGGGG j 

Itagaagccactggggaggagtcgagcattcgccagtatagctc „■ 
" Iatc^ctcccggcgctcggatgaggactatctc
Iggtgccctcacgcagcttaatgtggccttttcccgggagcaggc^ I 

1CCGTTACTAGTGGATCCGAGCTCGGTACCAAGCTTGGNCTCAA 



CCA^AC^l^CTATCTCAMGACTGGGGACCCCGCTTCAAGAAGCTCGCCGACATGTAT^TC 
TTCCCAAAAAGCATOC^ 
IrPTAAATAGCTTGGCGTAATCATGG 






rrflATTGGGCCCTCTAGATGCA^ 

|TG<^ CTCT ACCC C CTCTATTTAGC ATGAAGGAGC 

rCCCTTCACA 



TTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCA^AATT

KSgagggctagagccagccagtcacccttac^aggatccc 

GCCGTG^ATTCCATGTCCTCGATTTCrCAATTCCTCAG 

- IcTCTA^AGOTAACTTTGGGCCCACAGACCCGAGAGCTGTGGGTTGAAGCAAAGCTGTC 
~ rCAGATGGTCrcTCTGTTCTCTCCACACACAGGTCCCCGCCTTTTTAGAAGCAGCCTC 
CATCCTTAAATCTCTTCCT^CTGCCCGTGTTCACTTTAGAAATGGCAG 
iTC^GAGctGGCCTCTCTCTCTCATTAAATAAAAATAAGTAAGTTTC 
GAGACAAAGG^ACTGA'ITrcTACAATAGCGCTTTTATATGGAAGACTC 

Ir^^CGCAAGCTTATTCCCTTTTAGTGAGGGTTAATTTTACC 






TATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC I 
iScGAGGC^ACTMTACTOCTCAAGAATGAATAAA^ 
GAATACATCGAACAGTOCA^GCTGGAAAAGTGCTG^^ 
ATGGTGGGATGTGAAAGACCAACACCGTATTCTGTATCTC 

- JcTAAAAGAGAAATTAA^ 

8 Iaaaa^ATCTACCACACCTCCTTTGATGTAATGAAGGAAAACCCAAT^ 
3 Iac^TCCAGTATCATGGACCACTCTATATCTCCTTTCATGAGGAAAGGGATGCC 
3 AGAACTAC^ACTCTCGCACAAAGTGAGGATTTTGATGAAGACTACCGGA 

a^ta^ctccW 
actatatgctttagctatct 

lr&A&T(^AAATCAGACCTTCCACACTAGGTGATTATTCTTATTGATACC - 
Notch 1


X57405 


Stcaagtatgcatc^^ggttggtaccgagctcggatccactagtaacggccgccagtgtctgga 

^TTCGCCCTTATCGCGGGATCCCCTGGGTCGGAGCTTCCTGAGCGGGGAGCCCAGCCAGGCAGAC 
jTACAGCCGCTGGGCCCCAGCAGTCIX^CTGTGCACACCATTCTGCCCCAGGAAAGCCAGGCTCT 
3CCGACATCACTGCCATCCTCCATGGTCCCACCCATGACCACTACCCAGTTCCTGACCCCTCCTT 

:tcagcacagctactcatcctcacctgtggacaacacccccagccaccagctgcaggtgccagag 
caccccttcctcaccccatcccctgagtcccctgaccagtggtccagctcctccccgcattccaa 

catctctgattggtccgagot^ 

rTCCAGAGGCATTTAAGTAAACAGAGATGTGGGATGAAGGACCCCAGCTTCCGTTCCCAAGCTCT 
GTTGGGAGTCCTTTCCAGTGCTCCAGGATGCTGGGGCGACCAAAGGAGCCTTAAGCTAAGGGCGA 

attctck:agatatccatcacactggcggccgctcgagcatgcatctagagggcccgattcgccaa 


Organic anion 
transporter 3 


AF041105


TTGGCGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAAT
TCGCCCTTCGCGGGATCCGCATCCGGTCATCAGGAAATTCATCTGCAGTCCTGGGGTTATGTAAA 
AAAGGCCCTGAGTGTGCTAATAAACTACAGTACTTTTTAATCATGTC 

CTACTCGATCACAGCCATACCTGGGTATATGGTTCTTCTGAGGTGTATCAAGCCTGAAGAGAAGT 

CGCTTGGGATTGGATTACATGCATTTTGCACAAGAGTATTCGC^ 

TTTGGCGCTTTGATAGACAGAACCTGTTTACATTGGGGAAC 

ATGCAGGATGTACAATATAAATAACTTCAGGCGCATTTACCTAGTGTTGCCTGCAGCTCTTAGAG 
GATCAGGCTATCTCCCTGCACTCTTCATTCTGATACTTATGAGGAAATTCCAGTTCCCTGGGGAA 
ATCGACTCTTCAGAAACTGAACTTGCAGAGATGAAGATCACAGTGAAGAAAAGTGM^ACAGA 
TGTGCAAGCTTGGCCAAGGGCGAATTCCAGCACACTGGCGGNCGTTACTAGTGGATCCGAGCTCG 


Organic anion 
transporter K1


D79981


TGCGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCM^TTC 
GCCCTTGTCACCAGGAAACTCGTCCGCAGTCCTGGGGCTGTGTAATAAAGGCCCCGAGTGCACCA 
ACAAGCTGCAGTACCTTTTAATACTATCAGGATTTCTCAGTATCCTCTACTCATTCGCAGCCATA 
CCTGGATACATGGTTTTTCTGAGGTGTATCAAGTCTGAAGAGAAGTCACTTGGGATTGGAATACA 
TGCGTTTTGCATAAGAGTATTTGCTGGCATTCCAGCACCTATTT 

GAACCTGTTTACACTGGGGAACTCAGAAATGTGGTGCGCCAGGGGCGTGCAGGATGTATGATATA 
AATAGCTTCAGGCGCATTTACCTTGGGATGTCTGCAGCTCTAAGAGGATCAAGCTATCTCCCTGC 
ATTTGTTATTGTAATACTTACAAGGAAGTTCTCTCTTCCTGGGAAAATCAACTCTTCAGAAATGG 
AAATTGCAGAGATGAAGCTCACAGAGAAGGAAAGCCAGTGCACAGATGTGCAAGCTTGGCCAAGG 
GCGAATTCCACACACTGCGGCCGTTACTAGTGGATCCGAGCTCGGACCAAGCTTGATGCATAGCT 

TGAGTATTCTATAGGNCACCTAAN 


Organic anion 
transporting 
polypeptide 1 


L19031


ATAGAATACTCAAGCTATGCATCAAGTTGGTCCGAGCTCGGATCCACTAGTACCGGCCGCAGTGT 

GCTGGAATTCGCCCTTCGCGGGATCCAACTCGTCTGCAGTCCTGGGTCTGTGTAAAAAAGGTCCT 

GAGTGTGCCAACAGGCTGCAGTACTTTTTAATCTTAACAATAATTATCAGTTTCATCTACTCACT 

TACAGCCATACCTGGGTACATGGTTTTTTTGAGGTGTGTCAAGTCTGAAGAGAAGTCACT^ 

TTGGATTACATACATTTTGCATAAGAGTATTTGCTGGTA 

TTGATAGACAGAACCTGTTTACATTGGGGAACCCTGAAATGTGGTC 

GTATGACATAAACAGTTTCAGGCACATTTACCTGGGGTTGCCTATAGCA 

ATCTGCCTGCCTTCTTCATTCTGATACTTGTGAGGAAATTCCAGTTTCCCGGGGACATTGA 

TCAGCAACTGATCATACAGAGATGATGCTCGGAGAGAAGGCAAGCGAGCACACAGATGTGCAAGC 

TTGGCCAAGGGCGAATTCTGCAGATATCCATCACACTGGCTC 


Organic cation 
transporter 2 


D83044


GAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCGCC 

CTTATCGCGGGATCCTTGGTGTCCTTCTCTGCTCCTCCATGTGTGACATTC 

CCTTTCCTCGTCTACCGTCTCACGGACATCTGGATGGAGTTCCCACTGGTTGTATTTGCTGTCGT 

TGGCCTTGTCGCTGGGGCACTTGTGCTGTTGCTACCTGAGACCAAAGGGAAGGCTCTGCC 

CCATCGAGGATGCCGAGAATATGCAGAGGCCAAGAAAAAAAGAAAGAAAAGAGAATTTACCTCCA 

AGTCAAGCAAGCAGACCGTCCGCTAAGCTAAAAAGAAAGGGCATCATTGCTGCTGGAGCTGACTT 

TGCTCTCTCTGAGGCCAGAGATGGAGCTTCTCTCTCCCCTCCCCCCAAACCCACACAAACCAACC 

TCACTTACCCCTGAACTCCATCAGCAAGAGCTGTAGCTTGCACGGTCTGTTGCACTCATGTGTCA 

AGCTCTTCCTCCCAG£CAGGATTTTCCGCCTCACTC 
CCAGCACACTGGCGGNCGNTCTAAGNGGATCCAG 
GCGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCG
CCCTTATCGCGGGATCCCCTCTAGGTGGCGCCGTTCCAAGGAACCGTAAAGATTCCAGAGTATra 
TATATGGAATAAGATAAAGATTTATCATGCTACCCAAGCTTCTGGGAACACATAACCTTTGACCT 
C^A^AAAGATCCAGTAGGTACACAGATTCTCCCTAGGTTCTCTTTCATTGGGACC^TCTCT 
CTCACCCACTGGGGAGGAGATATGAGAAGGTCCCAACCAGTrCTCCAGTTCTCAGGACAGTCCCA 

tccaggaagcaggatgaacatatcaaggcctggcagtgaacggaagtgtactcagctgmctcat 

CAAGACTGAGCAGTGAGGAGTGTTCTGAAGAGTCTGAGAACTGCTAT^ 
C^AATACATGCCTTCAGAGAGCCGACATGCCTGGTGC^ 

GAGGTTCCAGGCCCCAGGTTAGGCACCTCGAGGGCAAGGGCGAATTCCAGCACACTGGCGGCCGT 
TACTAGTGGATCCGAGCTCGGTACCAAGCTTGATGCATAGCTTGAGTATTCTATAGTGTCACCTA 
AATAGCTTGGCGTAATCATGGNCATAGCTGTTTCCTGNGTGAAATTGTN 






IWTTCACNCAGGAACCAGCTATGNCCATGATTACGCCAAGCTATTTAGGTGNCACTATAGAMAC 
TCC^GCTATGCATCAAGCTrGGTACCGAGCTCGGATCCACTAGTAACGGCCGCCAGTCTGCTCG 
AATTCGCCCTTCTGCTGTGAGAGGGAAAGGGTTGCTAAATGCCATCGTCATCCGAGAAACCAAAG 
ATTCTGACGCTTGGAAGGTGTGCCTGCGACTCCGAGATAATGGGCTTCTGGCCAAGCCAACCCAT 

GGT<3ATATCATCCGGCTTGCCCCTCCACTTGTGATGAAAGAGGATGAGATC 
GATCATCAACAAGACCATCTTGTCCTTCTGAGAGTAC^AACTCTGGGGAGCCATC^AGATCGG 
GCTCTTGTGAAACTCTGCTTGGGATGGGCAGATTCGGCTTGTCTGTCTCCTAAAAGACAATTTTT 
TGAATATGTATTATATATTTCAGTTGATGCATAGTGGAGTGACACCTAGGAACCTGGAGGTGCCT 
GCGTGACACAAGAGTCAGAC^GAGAGGCATCTCTTTGTrAAAGTTTGACTGTGTGTCAGC 
AAGGAGAAACAGATCTATCTGCATACAGCCTGCAGAGTCCTGCCGTAATAAGGGCGAATTCTGCA 
GATATCCATCACACTGGCGGCCGCTCGAGCATGCATCTAGAGGGCCAATT 






TTTOAGAATTGGNCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAAT 
TCGCCCTrATCGCGGGATCCCATCGAGAACCAACCATGGGCAGCTTTACTAAG^ 
TCCCATATCCTCGATGAAGGTTTCACTGCTAAGGACATTCTGGACCAAAAAATCAATGAAGTTTC 
TTCCTCTGATGATAAGGATGCTTTCTATGTTGCGGACCTCGGAGACGTTCTAAAGAAGCATCTCA 
GGTGGCTGAAAGCTCTTCCCCGTGTTACTCCCTCCTATGCTGTCAAGTGTAATGACAGCAGAGCC 
ATAGTGAGCACCCTGGCTGCCATTGGGACAGGATTTGATTGTGCAAGCAAGACTGAAATACAGTT 
GGTGCAGGGACTTGGGGTGCCTCCAGAGAGGATTATCTATGCAAATCCTTGTAAGCAAGTGTCTC 
AGATCAAGTATGCTGCCAGTAATGGAGTCCAGATGATGACTTTTGACAGTGAAATTGAGTTGATG 
AAAGTTGCCAGAGCACATCCAAAGGCAAAGTTGGTTTTTGCGGATTGCCACTGATGMTC 
CAGTTTGTCGGCTCAGTGTTAAGTTTGGTGCCACACTGAAAACCANCAGGCTTCTCTTGGAACGG 

GCAAAAAGAGCTAAA 






TTGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTCG 
GCACGAGGCAGGACAGCAACGGGAAGACCAGCCATGAGTCAAGTCAGCTGGATCAACCAAGCGTG 
GAAACACACAGCCTGGAGCAGTCCAAGGAGTATAAGCAGAGGGCCAGCCACGAGAGCACTGAGCA 
GTCGGATGCGATCGATAGTGCCGAGAAGCCGGATGCAATCGATAGTGCGGAGCGGTCGGATCCTA 
TCGACAGTCAGGCGAGTTCCAAAGCCAGCCTGGAACATCAGAGCCACGAGT^ACAGC^TGAG 
GACAAGCTAGTCCTAGACCCTAAGAGTAAGGAAGATGATAGGTATCTCAAA^GCA^CT^ 
TGAATTAGAGAGTTCATCTTCTGAGGTCAATTAAAGAAGAGGCAAAACCACAGTTCCCTACriTC 
CTTTAAATAAAAACAAAAAGTAAATTCCAACAAGCAGGAATA^^ 

GTGGATACATGTATGTCGAGAAAGAAATAGATAGTGrmGGGCCCTGAGCTTAG^G^ 
CATGCAGACACCACTGTAACCTAGAAGTTTCAGCATTTTGCTTCTGGTCTTTCTGTGCAAGAAAT 
GCAAATGGNCACTGCATTTTTAATGATTGCTATTCTTTTATGAATAAAATGTATG 






TTGGCGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAAT 
TCGCCCTTCGCGGGATCCAGGCAGACTTTTCGGCACAGCGTGGTGGTACCGTATGAGCCACCTGA 
GGTCGGCTCCGACTATACCACTATCCACTACAAGTACATGTGCAACAGCTCCTGCATGGGGGGCA 
TGAACCGCCGGCCCATCCTTACCATCATCACGCTGGAAGACTCCAGTGGGAATCTTCTGGGACGG 
GACAGCTTTGAGGTTCGTGTTTGTGCCTGTCCTGGGAGAGACCGTCGGACAGAGGAAGAAAATTT 
CCGCAAAAAAGAAGAGCATTGCCCGGAGCTGCCCCCAGGGAGTGCAAAGAGAGCACTGCCCACCA 
GCACAAGCTCCTCTCCCCAGCAAAAGAAAAAACCACTCGATGGAGAATATTTCACCCpAAGATC 
CGTGGGCGTGAGCGCTTCGAGATGTTCCGAGAGCTGAATGAGGCCTTGGAATTAAAGGATGCCCG 
TGCTGCCGAGGAGTCAGGAGACAGCAGGGCTCACTCCAGCTACCCGAAGACCAAAAGCTTCGCCA 
AGGGCGAATTCCAGCACACTTGGCGGCCGGTACTAGTGGATCCGACTCGGTACCAAACTTGATGC 

ATAGCTTG 
Pancreatic
secretory trypsin 
inhibitor type II 
(PSTI-II) 


M27883


;WNNNCTNCTATGACATGATTACGAATTTAATACGACTCACTATAGG^ 

^AG^TTCGGCACGAGGGCATGAGAAGACCCCAGTGAGCGAGAAGGTCACCAAGTCCTCTAG^ 

SGTCCCT^TGGAAAGACGGCCATGTTTCTCTGCTCTGACAGTTGACGAGACATATC 
^G^A^G^AGACCTTCACCrrcCACTCTCATATCTGCACACTCCCAGACAAGG^GCA 

Staaagaagca^cggctctcgctgagctggtgaaacacaagcccaaggccacagaagatcagc 

reAAGACGGTCATGGGTGACTTCGCACAATTCGTGGACAAGTGTTGCAAGGCTGCCGAC 

Sctgcttcgccactgaggggccaaaccttc 

rcA^AACCATCTCAGGCTACXICTGAGAAAAAAAG^CATGAAG^ 

3GTGTAAAACCAACACCCTAAGGAACACAAATTTCTTTGAACATTTGACTTCTTTTCTCTGTGCC 
r;rftATTAATAAAAAATAGAAAGAATTNGNNCAAAAAAAAAAAAANANAAAAAGNGNGGGG 


Pancreatic 
secretory trypsin 
inhibitor type II 

(PSTI-II) 
(alternate clone) 


M27883


3NNNNNTCTATGACATGATTACGAATTTGAATACGACTCACTATAGGGAATTTGGCCC 

CA^VGAAT*TCGGCAC^AGGCTGAAGAGAAGCACCCTGCACAGTTC 

CraCAACCATCAAGGTAGCAATTATCTTTCTTCTCAGTGCTT^ 

AACACTACAGCTAAGGTGATTGGGAAAAAGGCTAAT^ 

TOACTATGATCCTCTCTGTGGTACTOACGGAAAAACTTACGCC^ 

AAAACAGGAAATTTGGAACATCTATCCGCATTCAGAGGAGAGGGCTTTGCTGAATGTC 

CG^A^TCTTTOA^GGC^CCATAATGTTTAGCAAGAAGGTTTGCTGAATAAATGCATC 

TtgGCACTGGCCGTCGTT^ 

rA^r^Ar^ArATCCCCCTTTCGCCAGCTGGCGTAATAGCGAAGAGGCCCGCACCGA 


PAR interacting 
protein 


U83590


CTATGACATCATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAMAATT 
CGGCACGAGGGCITTCTTGGGCATGCTTCAAGGCAAGCAGCAGAAGCTG^ 
AAG^AA^TCGTCTGGCTCCAGTCGCCTCTACGATCTCTACTGCKAGGCCA 
GGAG^AGCGTCCA^GTCAGAAAAGAAGAATGTGAAGGATATTCC 

CTGAGGGTACCACATCJ^AAAAGAAGGCT 

CCTGC^GCCA^roGTAAAGACCAGCCCCCCAGCACAGGCAAGAAAAGAAGGAA 

TAACA^CCATCCCAGGTCAATGGAGTAACTGTGGCCAAGAGTCCGGCTCCCAATAA 

^GTCCCAGCACCCCCCCTGCCAAGACCCCAAAAGTGCAGAAGAAAAAA 

GTGAATGGATCCACTCCTGTGTCCCCCCGTAGAGCCTGAAAGCAAAAAGCATC 

nrart & AGGAGGTC AAAAAGAAAGTCCTCC AGTCTGGCCTGNC AAAAAAAAANAA __ 


Peroxisomal 3-ketoacyl-CoA
thiolase 2


M32801


GAACCAGCTATGCCATATTACGCCAAGCTATTTAGGTGCACTATAGAATACTCAAGCTATGCATC 
AAGTTGGTACCGAGCTCGGATCCACTAGTAACGGCCGCCAGTGTGCTGGAATTCG 

a^^C^^GGATGGAGGCTCTACCACGGCTGGAAACTCCAGTCAGGTGAGTGATGGAGCA 

A^gg^^agccctgggccaccccctgggctgcaccggagcaaggcaggtggtcacgctgc 
tcaatcagctga^^^cgaggcagacgggcttatg 

:.^™^ir^m^^7v-Tv^af'iirTV^r«rrf^TCGAGCATGCATCTAGAGGGCCCAATTC 


Peroxisomal 
multifunctional 
enzyme type II 




■ TACATGATTACTACTTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTCGGC 
IcGAG^CTCGTCCCGAGAGGACCCTC 

TTCAAGAGTCCACAGGTGGTATA^TCGAAGTTTTACATAAAAT^ 

TCCT^m^^^CATATACGGAACTGCAGTGCATTATCTATGCCCTCGGAGTAGGAGCTT 
ACATTTGGAGTCATTCTCGCTCAGAAGTCCTTGATC 

gattgttatotacgtotattcttattctggcaaggacttatatgctatnatcagttc 






Peroxisome assembly 
factor 2 


D63673


TTNGCGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGACGGAGTTGTGTC 

TCAGCTCTTAGCTGAGCTGGACGGGCTTCACAGCACTCAGGATGTGTTTGTGATCGGAGCCACCA 

ACAGACCAGACCTCCTGGATCCTGCCCTTCTGCGGCCTGGCAGATTTGACAAG 

GGAGCAAGTGAGGACCGGGCCTCCCAGCTGCGCGTTCTGAGCGCCATCACACGAAAGTTCAAGCT 

GGAGGCCTCTGTGAGCCTGATGAATGTGCTGGATTGCTGCCCGCCTCAGCTGACCGGCGCAGACC 

TCTATCCTCTCTGCTCTGACGCCATGATGACCGCCCTCAAACGCAGGGTCCGAGACCTAGAGGAA 

GGGCTGGAGCCTCGGAGCTCAGCACTGTTGCTCACCATGGAGGACCTGTTGCAGGCCGCAGCCCG 

GCTGCAGCCCTCAGTCAGCGAAAGCTTGGCCAAGGGCGAATTCCAGCACACTGGCGGCCGTTACT 

AGTGGATCCGAGCTCGGTACCAAGCTTGATGCATAGCTTGAGTATTCTATAGTGTCACCTAAATA 

GCTTGGCGTAATCATGGTCATAAGCTTGTTTCCTGTGTGAAATTGGTATCCGCTACAATTNCACA

CAACATACGAGCCGGAACATAAAGTGTAAAACCTGGG 


Peroxisome 
proliferator 
activated 
receptor alpha 


M88592 


TTGNGATTGCGCCCTCTAGAGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCG 

CCCTTGCGCGGATCCTACGGCAATGGCTTCATCACCCGAGAGTTNCTAAAGAACCTGAGGAAGCC 

ATTCTGCGACATCATGGAACCCAAGTTTGACTTCGCTATGAANTTCAATGCCCT^ 

ACAGTGACATTTGCCTTTNTGNGGCTGCTNTNATTO^ 

ATAGGATACATTNNNAAGTTNCA 






TCTCTANNACATG^TTACGAATTTAATACGACTC 

ATTCGGCACGAGGCAGGATAACACCJ^CGCCAGGCCTTCACTTGTTCTCCATACTTCTCTACGGT 

TTTTGAGGTTAACTGTTGTTAAGTGTGAAGTGTTTGCCTAGAATAAGAGCTTA 

AACAATTCTTTTCTGATGACTTGGAGGGACAACGTAGGGGCAGATCTTGCAATCACTGAAACTGT

TCATCCTCTGAGCCTCAATTATAGCCCTCCACCAGTCCCCACACTATCAAACATGTAACCTAGTT

TTACTTATCCCAGCATGGTGATGACTGCTTTTCTGCTCAATACCACTTTC 

TAATTCTATCCATTTCTTGCTTCTTAAAATACTTCTCAAGGTACCTCAGGATGAAGTGA 

AGCTTTTTGTTACATGTGCCATTCTTCATGGTTTGCC 

CATTAATTCAGATACAAATAAATCCTTAGACAAAATATTTCACAAATGGTTTTTGA 

CAAAGGATCTCCAATCCTGAAAGTGAATAGACTCTTGCCTTAAGTAGCCTTACCTATTATTTGCA

AAATATGTTATTCTTTCTTTTGACAGCATTTCTC 


RCT-101 




CTCTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAA 

TTCGGCACGAGGACCCTGCCCATAACCTTTCCTCTGGTCCTTCTTCCTCTGTCTTCTC 

ATCnTTGTACCTGTTCTCTCCTGTCCTGCCCCATCCTCAGTAGCATCCCCCGGTGCTGCCTGGCC 

CCTGCCTCAGCAGTCAAATTCCCAGAAGCAGTCCAGGCGGCTGGGCTGACTCCTGAGACCCCCGC 

AGAGATCTTGGCCCTGGAGCACAAGGAGACACGCTGTACCCCAATGCGGAGAGGAGATGACTGGA 

CACAGATGKTTGCGGGACACGATTGAGGCCCTGTCCCTGCGCTGGAAAGGCTGTGTAGAGAACACC 

gcagaatagccaggctggaatgggtttgaagacttttgtcccagggacggctgggacaagatggc 

tcagtagttaagaacttttgctgk:tcttccaggagacccagtgcccatgtcagaagactc 

atctataacttcagctgtaggttttttgacacctctgacctctgtaggcaatgcactcatacaca 

gatacacatacctatatgtaattaaaaataataaagataagccttcttttctaaaaaaaaaaaaa 

aaaaaaaaaaaaaatgngggnggncgcnagcttn 


RCT-102 




TTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATT 

CGGCACGAGGTGAAGATGAAAAGCAAAACAACCTTTTTGATAAGCAAGGTATAGATTTTACATTT 

TTGTCCTTGCTCCCAATGAAATGGATAAACAAAAATAAAATCTGACAATGCCGTCTCTTCACGGA 

ATGTTGTTGTGTTAGCCGGACTGAAAGCCCACCTTAATTTTTATATAACGTCTTTA 

TTGACAGGGCAGGCCTTGTTCTGAACTGTTTGCGCTTCTGACTGTT 

ACTGTCCCTCTCATTCTCTCTTGCTTCCCCTCTGGCCTGAGTTTCTTGTGCATCCTCCCTACCCC 

CACCTCTGTTAGGGTAGATATATCAGCTATGTAAATAGAGCAAGGAAACGGTATTGTGCATTTGT 

GGCATTTACGNAGAGTTGCAGTTGTACGCTGCTGAAAACGCANGCTTTTTGTAACATC 

TCCATAAGTACCCNATGTATTTTAGNCTATTTTAGTCGTATTTGNTCNATAAAATA 

TANGGTAAACANANAACAAANAAAAAAANACCOTTGCGGCCGCCAAGCTTATTCCCTTTAATGAG 

GGGTTAAATTTTAGCTTGGGCCTGGCCGCCGTTTAACAANGNCC 
L^^TGCCTCTCCAGGGCTGGGATTTTCTTC^




M-rTrmraiA A &T TTTCAMAACATTCATTACATTGCATTGCC — — - 

ItTTATGACATGATTACGAATTTAATACGACTCACTATAGGGAA^ 






ITCGGC ACGAGGGAGCGGCC^ I 
OTA^ATTTCTAAAAGT^TCTCTAGTTTTTCTT^ 

JrAtrcTTGGCACTGGCCGTCGTTTTACAACCGTCGTGACTC I 

IAATCC ^ — ^ 
iTCTCAAAGTGATTTTTCCCAAGNATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAAT 

GCAGATCAGGTCGACACGTCGGGGTCAGTGAGTCCTACATCAGGAAGAAGTA^ 

g^tcggtccAcaaaS 

r^S^CTCCG^CCTCAGGGCAGCTCTTCCGATTCAGCTGCTCCTACTACTCC^C 
GCAG^AG^GA^GGA^CAG^TCCAAAGCAACCCCAGGCTGTC 

cmaggtcatcatatggS 

ICTTOGGGACATGGGACAGTAWJVCAACAGGAGGCAAACAAAACTTG 

Ittganactctga^ 

Eotc^c^actgIcS 
ScSatcaStcatgg^ 

Kgc^gagSctcaa^ac^tcacctcaatgataagtcttcgacttgc 

ITTCGGCACGAbOU r^Q^aTGCAATGACACTTACTGGCTCATTCTTGTCAAGTTGGATGCGG 
3^GT^C^TCTGTCCACCATCAGTAGAAGGCCATTGAAGCCCCA 



Ictcactgttcaagtcct I 

ItGTC TGTCTCTGTCC CCTTTGC ACCG^TGTGTGTGTGTGTGTGTGTGTGTGT 

I ttatgacatcattacgaatttaatacgact^ct atagggaatttggccctcgaggccaagj^ttI 

ca^accctc^^tcggacactgcgaagggctaccaccttctcaaacccatgggc^ 

acctgctggaatcagttcagtgtgacagctctgcagctgct^ 

P^TPTrTGTGCTGATGACTGCTTCCTGCTCTGANCTTGCTCN 
RCT-127




NTNNNNTCTATGACATGATTACGAAT^ 

AAGAATTCGGCACGAGGCTGTTTGTGGTGTAGGGACAATGGCGCCATTGTTCTTCGCGTCTTCAA 

AGCAGTTTCCAGAGGAAACAAAGATGGCGGGGAGCTCAAAGCATCCGAGAGAATTCGTGAAGGGA 

GGAAAGTTGCTTAAAACTTGACCATGTTAAAGTTCTTACTCCTGGCACGTCATGATACTTTCTGC 

CAGCAGTGAATGCCAGCCCATTGAGATGCCCGTCCTCTGGGAAGCTCCTGGAGCAGAAATGCATG 

GCCAGTGAGAGGAGACTTCTTGCGAAGGAAAACTGGATGTGGTGATGTTCAGAAAAAGCAGACAC 

AAGAAGCAAATACAGAGTATTAGTACTGGGATGTTTGAGATGGGGTCTGCATGGCCCTTGCTGTG 

TCCCTCTTAAACTGTCTAGTGGATGGGACAGCATCTGGTATGGCCTGCAGAGTTGAGCCTTGTGT 

ATACCTTTTGAAGGCTCTATGTTAATGTATGACAGAGTGTTCTATTCTGGGAAGGTT 

GCTGAGGATACAACTTTACAGACTAGCACAATGAGTGTACAATTACTGTAAAAAAGGCAGTTCAC 

ATTTTTATTGGTTATGAATTGNTTCCTTATTCATCCAACAGTC 

AACT 


RCT-128 




TATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGACAAGAATTC 

GGCACGAGGATACTATTAGACCAAGCAGCTCTCAAGTGTTTAATGTCTACTGTGACACCCAATCA 

GGCACTCCACGGACATTAATTCAACACCGGAAAGATGGCTCTCAAAACTTCAACCAAACGTGGGA 

AAACTACGAAAAGGK5TTTTGGGAGGCTTGATGGAGAATTCTGGTTGGGCCTAGAG 

CTATAGTCAAACAGTCTAACTACATCTTACGACTGGAGCTACAAGACTGGAAGGACAGCAAGCAC 

TATGCTGAATATTCCTTTCATCTGGGCAATCATGAAACCAACTACACGCTACATGTGGCTGAGAT 

TGCTGCCAATATCCCTGAGGCCCTACCAGAACACAGAGACCTGATGTTTTCTACATGGGATCACA 

GAGCAAAGGGACAGCTCTACTGTCCAGAAAGTTATTCAGGTGGCTGGTGGTTCAGTGACATGTGT 

GGAGAAAACAACCTAAATGGTAAATACAACAAACCCAGAGCCAAATCCAAACCAGAGCGGAGAAG

AGGAATCTCCTGGAGGCCTCGGGGCGGAAAGCTCTACTCTATCAAATCATCTAAAATGATGCTCC 

AGCAGACCACCTAAGGAACGTCAGCTGAACTGAGACAAATTAAAAGACCAACACATTCAATATTA 

AAATCC 


RCT-129 




TTATGACATGATTATGAATTTAATACGACTCACTATAGGGAA^ 

CGGCACGAGGGTGATGGTGGTGATGATCACGTGCCTGCTGAGCCACTAGAAGCTGTCAGCCCGCT 
CCTTCATCAGCCGACACAGCCAGGCCCGGAGGAGAGACGACGGAGGACTGTCCTCGGAAGGATGC 
CTCTGGCCGTCGGAGAGCACGGTGTCAGGCGGAATGCCGGAGCCACAGGTCTATGCCCCACCTCG 
GCCCACTGACCGCCTCGCTGTGCCCCCCTTCGTCCAGCGGAGCCGTTTCCAGCCCACCTACCCCT 
ACCTGCAGCACGAAATTGCCCTGCCACCCACCATCTCGCTGTCTGATGGGGAGGAGCCCCCACCC 
TACCAGGGCCCCTGCACCCTCCAGTTACGGGACCCGGAGCAACAGCTGGAGCTGAACCGAGAGTC 
TGTGCGCGCGCCCCCCAACCGGACCATCTTCGACAGTGACCTCATAGACAGCAGCATGCTGGGGG 
GCCCCTGTCCCCCCAGCAGTAACTCCGGCATCAGCGCTACCTGCTATACAGCGGNGGGCGCATGG 
AGGGGCCTCCTCCCACXrTACAGTGAGGTCATTGGTCACTACCGGGGCTCTCCTTCAGCAGCACCA 
GCAAAGTNAANGGCAGTCCTCCT 


RCT-137 




TTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATT

CGGCACGAGGGTGATCTCGGTTCGTCGCCACCATGGGGAAGCGACAGCACCAGAAGGACAAGATG 

TACATCACCTGTGCAGAGTACACTCATTTCTATGGTGGCAGGAAGCCAGATATCACACAGACAAG 

TTTTCGCCGCTTACCTTTTGACCATTGCAGTCTCTCTCTCCAGCCTTTTGTCTACCC 

CCCCGGAAGGTGTCGTCTTCGACTTGCTGAACATTGTCCCATGGCTTAAGAAGTATGGGACCGAT 

CCGAGCACTGGAGAGAAACTTGATGGGAAGTCCCTGATTAAGCTGAACTTTGCAAAGAACAGTGA 

AGGGCAGTACCACTGTCCAGTGCTGTACTCCGTGTTCACTGACAACACCCACATCGTGGCCATCA 

GGACAACTGGCAATGTCTACACCTATGAGGCAGTGGAGCAGCTAAACATCAAGGCCAAGAACTTG 

AGKMACCTGTTAACTGATGAGCCCTTTTCCAGGCAAGACATCATCACCCTGCAGGACCCCACC^ 

TCTGGACAAATTTTAACGTCAGCAGCTTCTTCCATGTAAAGAATAACATGCGAATGATAGATCCA 

GATGAGGAAAAGGCCAAACCAGACCCATCTTTNTATTTG


RCT-138 




TCTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAAT 

TCGGCACGAGGGTCTGGCCACCATGGGGGCCCCAGAGCCCTCCTGGTGCTTTCTGTTCCTTCCTG 

TCCTCCTGACTGTGGGAGGATTAAGTCCCGTACAGGCCCAGAGTGGTAAGCCATAACACCCCTGG 

TCTTTCTCTCTTCCTCTCAAGATTTCCTCAGGCTACCCCTTTTCCTTCTAGCTC 

TAACGCCGAGCCCTGATTGTTAACCTGTGTCTCCCTCTTCATCCTTCTGAGACAATTACCCAGGA 

TGCGAATGTTCCTCTGTGAGCCCGGGTGTACTGGCTGGGATTGTGCTGGGCGACTTGGTGCTC 

TCTGCTCATCGCCCTGGCGGTGTACTCTCTGGGCCGCCTGGTCTCTCGAGGCCGAGGGACTGCAG 

ACGGGACCCGGAAACAGCACATGGCTGAGACTCAGTCACCTTATCAGGAGCTTCAGGGTCAGAGG 

CCAGAAGTATACAGTGACCTCAACACACAGAGGCAGTATTACAGATGAACCCACCCTATGCCCAC 

CAACAACCTGATGCCCGGATCCACTCATTCCAGACGCTTACTCAACAAACCCTCCCTGGGATCAG 

GACTCCCGCTGGAAATAAATATCCACAGAATGGCCTCTGGGAGATAT 
RCT-139




3NGCTAAGNOTATGACATGATTACGAATTTAATACGACTCACTATANNGAATTTGGCCCTCGAW 
^CAAGAATTCGGCACGAGGAAAGGTTTTTTTTTTTTTTTTT^ 

rCTGAGTAGGAACTGTAGGTTTATAAAATACATGGNAACTGAAGTTTCTGAGATGGGACTTGCTC 

TX3TTAGCCAGGCTGATGTGGCTGGTGCTTGTCCCANCGCTAGGGAGGAGAGAGATGTGTANATTT 

CCTAGGACATGTGGCCAGCCAGCCAGCCTGCTCAGGGACCCACCTGCCTCCATCGTCCCAGGCGC 

TCATTACCCATATGCATGGTAGGTTCTAAAATTGAATTCAGGCCCTCAGGCTTGCAAGTTAAACA 

TGACATGAGCAGTCTCCCTGGCCCACTGGTTTCATGTGATCCCTAAANGCATTTCCCTTTGGCAT 

ANTAATCATGTTTTGACAATGCTACCATCCTTTGGGTATAAACTTACATGGTTTCCC 

TTGTTTAAAATGTGAATATTCTAAAAATGTTTAGAGACTCNTTCOT 

ATATTTTNCTAAGTAGAAAATAAACTTTCNGANTAAAACTCTGTAACCC^ 

NNNCCCNNNNNCNNNAimNNNNN^ 


RCT-14 




CN^CTONNTCTATCACATGATTACGAAT^ 

GCC^GAATTCGGCACGAGGATTAGGCACGCCCCGGAGTGGGCAAAGACCTGGCACGAAGGCACT 

CTTGCACATGCGCACACAGTTAACTTCCTGAAAGGAAGTCGGCTGGCGCGGCGGAGGCCAGCGCC 

ATCTTGTAATGGCAAAAGCTATCCCGGCTTTCCACGTGGGGCACAGTGGAAGCCAGCGCCATCTT 

GTGGTGGTGAATGTTATTGCGGCTCTCCACAGGTCCTTACTTGGGTTTAAA 

GTGCTGGTGTGTATTGTGTATATGCCACAGCCCAAGTGTTAAAAGAAGAGAAAAATTTGTAGG^ 

TAAGATGGGCATTTCAGGGATTGAACTCAGATTATTAGGTTTAGCAACAACCAACTTTACCTACG 

GAGATAATTCTCTGGCATTCATTTGTAGTTCTGAGAAAGTGTTATGAGCAAATGTTTACTCAGAA 

TAAATTAACAAATTAAAAAAAAAACAGTAGTCTGGGTTCGGTCCCCAGCTCCAAAAAAAAAAAAA 

AAAAAAAAAACAATTGCGGCCGCAAGCTTATTCCCTTTAAGGGAGGGTTAATTTTAGCTTGGCAC

TGGCCGTCGTTTTACAACGTCGTGACTGGGAAAACCTGGCGTTACCCAACTTAATCCCCTTC 

CACATC 


RCT-140 




TATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAG^ 

GGCACGAGGACCGGGAAGCCCCAGCCTGTGATGGTAACTTGGGTGAGAGTCGATGACGAAATGCC 

TCAACATGCCGTTCTGTCTGGGCCGAATCTGTTCATCAATAACCTAAACAAAACAGACAATGGTA 

CTTACCGCTGTGAGGCTTCCAACACAGTGGGGAAAGCTCATTCAGACTATATGCTGTATGTATAC 

GATCCCCCCACAACTATCCCTCCTCCCACAACAACCACCACCACCACTACCACCACCACCACCAC 

CACCACCATCCTTACCATCATCACAGACACCACGGCGACGACAGAACCAGCAGTTCACGATTCTC 

GAGCAGGTGAAGAGGGCGCCATTCGGGCAGTGGACCACGCGGTGATTGGCGGCGTCGTAGCCGTG 

GTGGTGTTTGCCATGCTATGTTTGCTCATCATTCTCGGCCGCTATTTTC 

ATACTTCACTCATGAAGCCAAAGGAGCCGATGACGCAGCAGACGCAGACACAGCTATAATCAATG 
CAGAAGGAGGACAGAACAACTCCGAAGAAAAGAAAGAGTACTTCATCTAGATCAGCCTTTTTGTT 
CCAATGANGTGTCCAACNGGCCTGCTTAGATGATAAAGAGACAGTGATNCTGGGAAAAA 


RCT-141 




ATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC 

GCACGAGGAAACAAACTCCAAACTCTGAAACAGCTGAAGTGAATCCAGCTCATGAAGATGCAGAT 

GGAGGTGAAGGAGAAAAACCTCTGATTCCCAGGCCCCCTGTGCTATCCCCCACTGCTGTTCCAGG 

AACCGATCTCTTGGTTGAGAGACTCAATCCAGGCATTAACATCCATCCCATGTTTTCAGATGAGA 

CCAATATATGCAATGGTAAGCCAGTGGATGGACTGACCACTCTGCGCAACGGGACGTTAGTTGCA 

TTTCGAGGTCATTATTTCTGGATGCTGAACCCATTCAGACCACCATCTCCACCACGCAGAATC 

TGAAGTCTGGGGTATTCCCTCTCCCCTTGACACAGTTTTTACTAGATGTAACTGTC 

CTTTCTTCTTTAAGGATTCTCAGTACTGGCGCTTTACCAACGATGTAATGGATGCTGGGTATCCT 

AAACTAATTGTCAAAGGCTTTGGAGGACTAACAGGGAAGATAGTGGCTGCTCTTTCAATAGCCAA 

GTACAAGGACAGACCTCAATCTGTGTACTTCTTCAAGANANGTGGCAACATCCAGCAGTACACTT

ACNAACAGGAGCCCCGTGANGAAATGCACAGGGAGGCAGCCTGNCATC 


RCT-142 




TCTCCTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAG 
AATTCGGCACGAGGCTCCTGGGGGTACTTCCTATGAGAGATACGAGTGCTACAAGGTTCCAGAAT 
GGTGCTTAGATTACTGGCATCCTTCTGAGAAAGCAGTGTATCCTGATTACTTTTCCAAGAGAGAG 
CAGTGGAAGAAACTGAGGATGTGGAGCTGGGATCGGGAGGTTAAACAGCTGGAGGAAGAAACGTC 
ACCTGATGGTATTATGACTGAAGCTTTGCCTCCTGCCAGAAAGGAAGGCGACTTGCCCCCATTGT 
GGTGGCATATTGTGACCAGACCTCGGGAACGGCCCACATAGAGACAGGCACCGCACTGTTCATGC 
TTGCAAGTGAGAGTTACAGAACACATTCACACTTGCCCTAATAAAAGTAACTAGAGACCANNNAA 
NAAAGAANNAAAAAAAAAAAAAAGAAAAAAAAANGTTGTGNGGCCGCAAGCTTATTCCCTTTAGT
GAGGGTTAATTTTANCTTGGCNCTGGCCGTCGTTTTACAACGTCGTGACTGGGAAAACCCTGGCG 
TTACCCAACTTAATCGCCTTGCAGCACATCCCCCTTTCCCAGCTGGCGTAATAGCGAAAANGCCC 
RCT-143




TATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC 
GGCACGAGGTGATGTACCGCCTGAGCTCGTCAGTGCTGCCCCGGGCCTTGGCCCAGGCCATGCGC 
ACAGGACATCTTAATGGCCAAAGCCTTCATAGCAGTGCAGTGGCCGCTACGTACAAGTATGTGAA 
TATGAAGGCACAGGAACTTGATGTGGACATGAAGTCTGCGACTGACAGTGCAGCTCGGATTCTGA
TGTGGACAGAACTCTTCCGAGGCCTGGGCATGACCCTAAGCTACCTCTTTCGGGAGCCCGCCACC 
ATCAACTACCCCT1TGAGAAGGGCCCACTGAGTCCGCGCTTCCGTGGGGAGCATGCACTGCGCCG 
CTACCCGTCTGGGGAGGAGCGTTGCATCGCCTGCAAGCTCTGTGAGGCCATCTGTCCTGCACAGG 
CCATCACCATTGAGGCTGAGCCAAGAGCAGATGGCAGCCGCCGGACTACACGCTATGACATTGAC 
ATGACCAAGTGTATCTACTGTGGTTTCTGCCAGGAAGCCTGCCCTGTTGACGCTATCGTGGAG^ 
CCCCAACTTTTGAGTTCTCCACCGAGACGCATGAGGAGTTGCTGTACAACAAGGAGAAACTACTC 
AACNATGGTGGACA 


RCT-144 




TTATNANATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGA^ 

CGGCACGAGGGATTCCCCCTTCAGGAGGTAAATGAAAAACCTAAGAAGAAAAAGCTAAAACCCCA 

GGAAACCCTACAGGAAAATGGAATGGAAGACCCACCTGTCTCTTTGCCTAAAACCAAGAAAAAGA 

AAGCTTTTCCCAAGGAGGAGTTGGCCAGTGATCTTGAAGAGATGGCTACTAGCAGCATAAGTGOT 

CCTAAGAAAAAGAAGTCCTTACCTAAAGAGGAAGTGGCCAGTGAACCTGAAGAGGCAGCAAGCCC 

CATCACCCCTAAGAAGAAAAGGAAATTTTCTGAGGAGCCTGAGGCTGCTGCAAGCTGCACAAAGA 

GCAGCACAAAGAAAAAGAAAAAGTCGCAGAAGGCCCGGGAGGAGGATTAGAATGGACCTGCTTGG 

TGGGAGGGGCATACTTTATGGTGGCAGTTCCTCCTGCCCATGATAAACCCCAATAAAAACGCAAA 

acccgaaaannnanannnnnaannnnnnnn^^ 

CAAGCTTATTCCCTTTAGTGAGGGTTAATTTTAGCTTTC 

GACTGGGAAAACCCTGGCGTTCCCAACTTTATTCGCCTTGCAGCACTTCCCCTTTCGCCA^ 


RCT-145 




TATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC 
GGCACGAGGGCCGAACCTCGCGGGGCGGGAAGCCGCGAGATGGACACCCCTCCGCTCTCAGACTC 
GGACTCCGGGTCGGATGAGTGCCTGGCCTCAGATCAAGAGTTGCAGGATGCGTTTTCCCGCGGAC 
TCCTAAAGCCAGGCCTCAATGTCGTGCTAGAGAAGCCGAAGAAGGCGGTGAATGACGAGAATGGC 
CTGAAGCAGTGCTTGGCTGAATTCAAACGGGATCTGGAGTGGGTTGAAAGGCTCGATGTGACCCT 
GGGTCCTGTGCCTGAAGCCAGTGAAACTCAGTCAACACCCCAGAACAAGGACCAGAAGAAAGGTG 
TTAATCCAGAAGACGACTTCCAGAGGGAAATGAGTTTCTACCGCCAGGCCCAGGCTGCTGTGCTT 
GCAGTATTACCCCGACTCCACCAGCTCCAAGTCCCTACGAAGAGGCCCACTGATTATTTTGCAGA 
AATGGCCAAGTCTGATCAACAGATGCAAAAGATTCGACAGAAGCTGCAGACTAAACAGGCTGCCA 
TGGAGAAATCTGAAAAGGCCAAGCAACTTCGAGCGCTTAGGAAATACGGAAAGAAGGTGCAAACT
GAGGTCCTTCAGAAGAAGCAGCAGGAGAAAGCGCATATGA


RCT-146 




GGGGGGGTCTCNANATGATTCGAATTTAATACGACTCACTATAGGGAATTTGGCCCTC 

AGAATTCGGCACGAGGGACATCTGTAGCTGGGGAGTCAGTTAANGTGGTCTCTTCCTGCGCGAAC 

ATGGTTCGGACCAAAGCAAACTACGTCCNGGGAGCCTACAGNNNAGTGGTGGCTTCTCAAGCCCC 

TAGGAAGGTGCTTGGCTCCTCCACCTTTGTCACCAATTCTTCCGGTTCGTCNAGAAAAGCTGAAA 

ATAAGTATGCTGGAGGGAACCCAGTCTGTGTGCGCCCAACTCCCNANTGGCAAAAAGGNATCGGN 

GAATTCTTCAAGCTGNCCCCTCANGATTCTAANGGATAAAACCNGATTCOTG 

CTGNGGCNTTTNAAAATNCNATANTNGATNAC 


RCT-147 




tatcacatgaitacgaatttaatacgactcactat^^ 

GGCACGAGGCTCAAATGTATTTATTTAAGTCTGAGCCTTCCTTTCCAGTTTTTAGACCAACACTG
CTCACCCCTGCCTACATCCCTGGCCCAAGAGGAAACTGTATAAGGCCTCTGGGCTCCCGTGGGGG 
AGGGCCCAGGAGCGGCAGGACCCCTGTGCCTAAGACACCACCAGAACCAGAAGGAAACAGACCGG 
ATAAACAGATCTCTGCACCCAAATCCCGTGGGAGGGAGAGCTGAACCTTCAGAGACGCAGACAAG 
CCTGGGAAACCAGAAGAGACTGCTCTCTACACACACATCTCGGACATGGTGTGGCCACCACCCAC 
GTTCCTCTCTGGCCCTGGAGCCCCAGTGGGCTGCATCACCCACCGCTGGTGCGTGGTGCATGGCT 
GTGCCCGAGCAGGCTCAGGCAGCTCTCCTCACCCACTGCACCTGCCATCACACCTTCTGTGGAGC 
TACTTAATAAACACAGCACACTGTCAAGTGTTTTTAAATCCAAAAAAAAAAAAAAAAAAAAAAAA 
AGTTGTGCGGCCGCAAGCTTATTCCCTTTAGTGAGGGGTAATTTTANCTTGGGCACTGGCCGTCG 
TTTTACAACGTCGTGACTGGGAAAACCCTGGCGTTTCCCCAACTTGNGGGGNGNGGNGG 


RCT-148 




TTTTTTTTrri'l^TTTTTTTTTTTTC 

GCTCACTTAAGAATGGCACAAACAACAAGCAAGGTAGTAGTGAGATACTGCTCTGCAGTTCTCGA 

TGGTCTCATCATGGCCTTGGAGAGTTGGGACCCAGAGCAGAGCGAAGCTAGGCTCCTCAGAAGGA 

GGACCCCGACTGTGGAGGAAGGCCTTTAGGGCTAGCCTTCAGATCCAGATGTCAGAACTGCAATC 

ACCCCCTGGGTAACGAAGCTCATGAGCCAGTGCTGGCCCAAGAGGCTCTTTCCCAAAGTCCACCA 

GAAAGTTGGGGTTCAACTTCAGCCCTCCATTTGCTGTATCTACATCAATTTC 

CCTTCCCTAATGAGATTAGGGTAAAACTGCTTGTCCCAGGCGCTGTACAGTGATGTAGTGACGTA 

AAGACGCTTCCCATCTAAGCTGAGCTGGATCATCTGAGGACCTCCAGGAACTCGTTTTCCCTTGA 

CCACTAGGGGCTCCGGCTG 
RCT-149




CTCTATGACATCATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAA 

TTCGGCACGAGGGGCGCGCCTACTCGGACATGAGAAAGGCCAACTGGAAAAACTCAGACAAATAC 

TTCCATGCTCGGGGGAACTATGATGCCGCCCGAAGGGGTCCTGGGGGAGCCTGGGCTGCTAAAGT 

CATCAGTGATGCCAGAGAGGGTATTCAGAGATTAACAGGACACGGAGCAGAGGACTCAAGAGCTG 

ACCAGTTTGCCAACAAGTGGGGCCGGAGTGGCAAAGACCCCAACCACTTTCGACCTGCTGGTCTG 

CCCAGGAAATACTGAATTTTCTCTTCATGTTGTTCCCGGGCGCACAGCCCCCCAAGGAAAGGGGC 

AATTACTGAGTTGAGTTATTTCCTAAAACCTGGATCCCTAAACATCCCAATGTGCTGAATAAATG

CTTGTGAAATGCAI^ANANr^AAAAAAAAAmAAANAAAAAAAAAAANA 

TGAGCGGCCGCAAGCTTATTCCCTTTAGTGAGGGTTAATTTTAGCTTGGCACT 

ACAACGTCGTGACTGGGAAAACCCTGGCGTTACCCAACTTAATCGCCTTGCAGCANATCCCCCTT 

TTGCCAGCTGGCGTAATACCGAANAGGCCCGCCCCGATCGCCCTTCCCACAGTTG 


RCT-151 




TATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC 
GGCACGAGGGCCTGTTGAAAGGGCTGGGCCCCGCCGGCCCTTTTGAAATGGTGTACTGGACAGGA 
GACATCCCTGCCCATGATGTCTGGCAACAGTCTCGACAAGATCAGCTGAGGGCCCTGAACACCAT 
CACAGACCTCGTGAGGAAGTTCTTGGGCCCTGTACCGGTGTACCCTGCTGTGGGCAACCATGAGA
GTACTCCTGTCAATGGCTTCCCTCCCCCCTTCATAAAGGGAAACCAGTCTTCAC7VATGTCTTTAT 
GAAGCCATGGCCAAGGCATGGGAACCCTGGTTACCAGCTGACGCCCTTCACACCCTGGTCTACCG 
CATGAGGGCTGATGAGCAGCTCTTCCAGACCTTCTGGTTTCTCTACCATAAGGGCCACCCACCTT 
CAGAGCCCTGCGGCACACCCTGCCGCCTGGCCACTCTGTGTGCCCAGCTCTCAGCACGTGCAGAC 
AGCCCTCCTCTGTGTCGCCACTTGATGCCCAATGGGAGCCTCCCAGATGCCCATAGCTTGTGGTC 
ACGGCCCCTGCTGTGCTAGTGTGGGAAAAGTTCACATATTAGCAAAGGGATGGATTCCTGAGTAT
CGCTGATCTACCTGAGGCAAANCTTTCNGGGAAGGAGGGGGGGGGGNGN 


RCT-152 




TCCNAAGTGATTNTGCCNANAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC 
GGCACGAGGATCGGTTTXrGGAGACCTGAAAACCCCCGCCGGCCTCCAGGTGCTCAACGATTACCT 
GGCGGACAAGAGCTACATTGAGGGGTACGTGCCATCACAAGCCGATGTGGCAGTATTTGAAGCAA 
TCTCTGGTCCACCACCCGCTGACCTGTGTCATGCTCTGCGTTGGTATAATCATATCAAATCTTAC 
GAAAAAGAT^GGCCAGCTTGCCGGGAGTGAAGAAATCTTTGGGCAAGTATGGCCCTGTCAGTGT 
GGCAGATACCACAGGAAGTGGAGCAGCAGATGCTAAAGACGATGATGACATTGATCTCTTCGGAT 
CTCATGATGAGGAGGAAAGTGAAGACGCAAAGAGGCTACGAGAAGAACGCCTTGCTCAGTATGAG 
TCAAAGAAAGCTAAAAAGCCTGCAGTTCTTGCGAAGTCTTCCATCTTC 

GGACGATGAGACAGACATGACGAAACTTGAGGAGTGTGTCCGAAGCATTCAAGCGGACGGCTTGG 
TGTGGGGCTCCTCTAAATTGGTTCCAGTGGGATACGGAATTAAAAAGCTTCAAATACAGTGTGTA 
GTTTGAAGATGATAAGGTTGGAACAGATATGCTGGAAANANCANATTACTGCTTTT^ 


RCT-153




TATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTC 

GGCACGAGGGATGTAAAAGAAGCCATAAGAAGGCTGCCTGAGAACCTTTATAATGACAGAATGTT 

TCGAATTAAGAGAGCCCTAGACCTGTCTATGCGGCATCAGATCTTGCCTAAGGATCAGTGGACAA 

AATATGAGGAGGACAAATTCTACCTTGAACCCTATCTGAAAGAGGTTATTCGGGAAAGAAAGGAG 

AGAGAAGAGTGGGCGAAGAAGTGATCGTGTAGTTAAGATCTGTGGGTGCGCCTGGTCTCACCCTA 

TTTTATGACATTGTTTCAACCTGAATCACAACTTAAGAATCATTTGCTC 

TAAATAAATGTCTATTATAACCGTAAAAAAAAAAAAAAAAAAAAAAAAATGGAAGCGGCCGCAAG 
CTTATTCCCTTTAGTGAGGGTTAATTTTAGCTTGGCACT 

GGNAAACCCTGGCGTTACCCAACTTAATCGCCTTGCANCACATCCCCCTTTCCCAGNTGGCGTAA 
TANCNAANAGCCCGCCCGATCGCCCTTTCCCAACAGTTGCGCNAGCCTGAATGNCNAATGGGAAC 
NCNCCCTNTANCGGCGCNTTAAGCCGGCNGGNGGNGNTGGGCCCCCCCCCNTGNCCCCC 


RCT-155 




NCTCOTAACATGATTACGCATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAA 

TTCGGCACGAGGATTACGAGAGACCTGTTCTGCACCTGGTTGCTCTCAACACGCCGGTGGCTGGG 

GACATCCGAGCAGATTTCCAGTGTTTCCAGCAGGCCAGGGCTX3CAGGACTACTGTCCACCTTCCG 

AGCGTTTCTGTCATCACACTTGCAGGATCTCTCCACAGTTGTGCGG 

TTCCAATTGTGAACCTCAAGGGCCAAGTGCTTTTTAACAACTGGGACTCAATATTTTC 

GGAGGTCAATTCAATACACACATTCCGATATACTCCTTTGATGGTCGGGACGTGATGACTGATCC 

TTCCTGGCCGCAGAAGGTTGTTTGGCACGGCTCCAACCCCCACGGTGTCCGCCTCGTGGACAAGT 

ACTGTGAAGCCTGGCGAACCACGGACATGGCAGTAACAGGATTTGCCTCCCCACTGAGCACAGGG 

AAGATTCTGGACCAGAAAGCATATAGCTGTCCTAATAGGCTAATCGTTCTGTGCATCGAAAACAG 

TTTCATGACAGACGCAAGGGAAGTGATAACCTCCCCATGGTTCTTAAAAGAATATTCTAATATTO 

CTTATGTGAAAAGTTGACACTGNAATCTAAAAAANNNNANNNNATNNAOT 
RCT-158




ANGACATGATTACGAATTTAATACGACTCACTATAGGGGAATTTGGCCCTCGAGGCCAAGAATTC 
GGCACGAGGGCAAGCGAACAGGGTCTAGCAAAGAGGAGCTACGGANACAGACAGACATTTTAAGT 
TTTCCAAAGAATCATCACCTTCTGCCGCAGGTCGCTTTC 

CCANCCGGACTGTCTGACGAGTCAGGCATTTCGTCCACCAAATGCCGGTCCTCANAGTTTGCCTG 

AGACCCAATTGAAGGCACCGCCTGGCGGCTCCCGCTGACATCCAAGCTCTCCTGCGCCGGCACCT 

TGCAGGCGCTCTTGGGGGGGCGCGGGGGTCTGTAGTAGAACTCGGGCAAGCTGCCCCTCTCCACC 

TCCTGCCACTCGTATCTGCCCTCCAGGGGCTTATGATTCTGAAAGTCGAAATTCCACTTGCGCTG 

GCTCGCTTCTTCCATATCTCGGCAGTGCTTCTCCAAGTCCCGGGTTAATTCTTCATGATTGACCG 

GGCCGAANAAGTTTCTGCANGCGGAAGGCTTGGGGTGCTCGGTTTGTCTGGCGTTCATCC 

AGGCTCGGGCTCCGTTANACACTCTCACGTTTGACATCTTCCTCCCCGGGCGGGNGTGGACACCG 

CCTCTNCTCTCTCCGAAAAAAAAAAAAAAAAAAAAAAAACATTGCGGCCNCAA 


RCT-161 




TCTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAAT 
TCGGCACGAGGGAGCCACTGGGTAACCTGGCCAGCAACCTTCTCTGAAGCTGAATCAAAAACTAA 
ATAGGAGAATATGGCAAGTGCAGACTGGGGATATGAAAGCAAAAATGGTCCTGACCAATGGAGCA 
AACTATATCCCATTGCCAATGGAAACAACCAGTCTCCTATTGATATTAAAACCAGCGAAGCCAAA 

catgattcctctctgaaaccagtcagcgtctcctacaatcctgcaactgccaaagaaattgttaa 
tgtgggacattctttccatgtagtttttgatgac^^ 

ctcttgctgatagctatcggctcacccagttccattttcactggggcaactcaaacgaccatggc 

tctgaacacaccgtggatggagccaaatattctggagagcttcacttagttcactggaattcagc 

caagtactccagtgctgctgaagccatctcgaaggctgatgggctgg 

tgaanggtgggtccagccaacccnaacctgk:anaaagtactggatgccctaanctcagttaaaac 

taanggaaaacnanccccattcnccaattttgaccttccagtctccttcctt 


C4 

to 

r-» 

1 




atgacatgattacgaatttaatacgactcactatagggaatttggccctcgaggccaagaattcg 
gcacgaggaatttaagcatattagtcagcggagaagcttcggcgagcagaagtggacttggagcg 
cgtgcgggtgtggtacaagctggacgagctgtttgagcaggaacgtaatgttcgcacagccatga 
ccaacagagcaggattgcttgccctgatgctgcaccaaaccatccaacacgatccacttactacc 
gaccttcgctctagtgctgaccgctgaaagtcaccagcccagagk:ctctcagccctgcattcagt 

CAGGGAGGGGCTCTGCATTTCAGCTCGCTCTTCCTCCGTTCATCTGTTTATTC 
TTTTCTTCTTACCATCCATGTTTTGGCTTCTGTTTGCCCTTATC 

TTGTCTCCTCTCCATAGTCAGTGCTGGGTGAAAGTCAAGTTTACTCAGCCTTGCCTATACCCTCC 
CCCAAAATAAACAGGTTTTGTTAATAAAATTTTGAACAAGAATAAAAAAAAAAA 
ACAATTGCGGCCGCAAGCTTATTCCCTTTAGNGANGGTTAATTTTACTTGGCACTGGCC 
TTACAACGTCGTGGACTGGGAAAAACCTCGCGGTTACCCAACTTAATCGCCTTGC 


RCT-164 




NNCNAATCTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCC 

AAGAATTCGGCACGAGGATCl^ACTGCCCAGTTGTTCTGGCATTCGAAAAAAGGACTGTAGACTA 

TGGTCTAATGTTCAAGGATGTGGATGGACAGGACTGTGGAAAATAGTGAGAAACTGGTTCTCCGC 

TGGAGGAAGTAGGGTTAGGTTTAGGACCTTTGCAAGTGGGGGTCAGGCACACCACCAGGAACAGC

TCCTTCGATAAATAAAACAGTTATCACATTCCCACAACAACCTAAACACAGACTACCTCTTCCCT 

TACCAGATCTGCTAAGCTGTGAGGTTCTAAGAGGTCTGAGTGTGTTGTTTAAACTT^ 

GTTTTTCATTATAACAATTCAAGCCAGACATTTGAAAGTGACTGT^ 

TACTAACTGAGGGCCCACCTTCTAGTTTCTAGTTGCACTTCTGAAATCCCATGTTTTC 

GGACCTGAGTTTGGTGGCTTTTAAAATACAGATGAGATCAATTTATCCGACTTCATGAGTNATCC 

TNCATTCTCTCGAGAAAAANCTGATTTCNATGTNAATTAATTGTACTATGACC


RCT-165 




TATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC 

GGCACGAGGATTAAAGACAGGGGGGGCTACCTGAACAAAGTCTGCAACCTCCTGCCCATTAGGAT

CCTGTCCTACATCT/IGCTGCCCTGCACTCTGCCCGTGGAGTCGGCCATCGCTGCAGTCCACAGGC 

TGGTGATGTGGCTCCCTGATATCCATGAAGATATCCAGTGGCTACAGTGGGCAACATCCCAGGTG 

TGTGCCCGAATGACCATCTGCCTGCTCCCCTCTACCAGATCCAGAGCATCCAAGGATAACCATCA 

AACACTCAAG^TGGATATCACCCATCTCTCCACAAACCCCAAGGCAGCTCTGCCGGTTTGTAAA 

TTGCTGGTCTCCGTGCTTCCGATGAACTTGGGCATTCTCCCTGTGGATGGTTCCAGGAGAGGCCA 

TAGCTGAAGGCACTCTGCCTTCCACCCCAAGTCCAGTTTGACCTTTATCTAGAGCAACAGTGTCT 

AGATGATAGGTGGGTGGGGGGTGCTGTCTCTCTGTTTCCCTCTGGGAAGGGTTOTGT^ 

GGAGGCAGCTAGGAAATTTCTCTTCAGGAGCTGAGCCTGTGCAGCTGCCCCCTTGGTGCTGTGTG 

GTAACCTCATTGC 
RCT-166




^TTCANATCATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCC^G^TTC 
SGCA^AGGCGCCGTTGCAGCTGCTTCTGCCTCTGTCCACCGAGGCAGTTCACCCGAGGCCGATC 
CCGAGGTC^CAGCGTCTACTTCCCACAGCCTCCGCCATCMGTCTGGAGCTCTA^^ACCT 
ATCTCCCAG^CTCCCGTGCCGTCTACATCTTCGCCAAGAAGAACGGCATCCCCTTCCAGCTGC 

^GGTGCCGGCTTTGAAGGATGGCWACTTCGTC 

^GTAG^GTACAAGGCACCTGACCACTGGTACCCTCAGGACCTACAGACCCGAGCTCGTGTGG 
fVTGATGTTCCCTGTGTTCCTGGGACAGCCGGTTC 

ftArTfiRRTGGATGCCTGCAGATGCTGGAGGACAAGTTCCTGCAGAACAA 


RCT-171 




Etatgacatgattacgaattnatacgactcactataggggaatttggccctcgaggccaagaat-i' 

CGGCACGAGGCTTTCCATCCATATTACCACCCTGGTTATTTCCCAAGCCAGCTCCACCCCCTCTA 
CTCTOA^Sc^ACCCTC^^ 

^AACTCT^^'ATTGCTATTATGCTTAGGTTCAGCATTGGATATATGCACGCTGATTCCTTTAATGA 
TCAAGTCCTCTCCACAAAGAGACTGGGCAACCTTATCATCTGCAAAGGTAACA 
CTG^AT^GTTTCGGA^TGAAGACATCTACCACTTCTCCATACTGACAGAAGAACTC 
^AGC^ATGTCCTCTGTACAACGTCCAAAAAAAACCTTTG 

StcagSaattt^^ 

^GCTACCCAACTTAATCGCCTTGCAGCACATCCCCCTTTCGCCAGCTGGCGTAATAGCGAANA 

GGCCCG^ACCGATCGCCCTTCCCAACAGTTGC 
ANCGGCGCATTAAGCGCGGGCGGGTGTGGTGGGT 


RCT-177 




CCNNNNN^CTCTATGACATGATTACGAATTTAATACGACTCACTATAG 

GGCC^^^TTCGGCACGAGGGAAGCCTCGGGGAGGAGAGAC^GCTCTGTTCCGTTTTTGTTrCT 
GGTGAAGAGTGTGGGGATTTTCGTCTTGGCTTCAAATCCGTCCAAGCAGCACAGCGGGTCAAAAT 
CTCCTGCAGGTCCTCAGATTACCAGTGACTAAACTGCACCTAGGCAGACCAGCCATGAGAGCCAC 
TCAGCAGGACTTCGAAAATGCAATGAACCAGGTGAAACTCTTGAAGAAGGACCCAGGAAATGAAG 
TGAAGCTGAGACTGTATGCGCTGTATAAGCAGGCCACAGAAGGACCCTGCACTATGCCTAAACCA 
GGTGTGTTTGACTTTGTCAATAAAGCCAAGTGGGATGCATGGAATGCTCTGGGCAGCCTACCAAA 

GGAAACTGCCAGGCAGAACTATGTGGACCTCGTATCCAGTTTGAGTTCCT 
GCCAGGGAAAGGGTGGAGCTGATGGGAAAGCCCAGGAGTCCAAGGGCATNCTGGTAACGTCTCAA 

^TG^TCACAAAGATCACGTTCAATCGGCCTCCAAAAAG^ 

CAGGATATTATACTCGCGCTTAAGAATGCCAGCACGGATGACACAGTNCATCACCGTTTTCACAN 


RCT-179 




TCCTCATAANATGATTACNAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGA 

ATTCGGCACGAGGGTTCTGCGGAACAGTAGGCAGTTGTTTTCCGTCCGGCTTCTCTCACAC 

GTGCGCGCCTCCACCTCATGGAAGACTCGATGGACATGGACATGAGCCCTCTTAGGTCTCAGAAC 

TACCTTTTCGGTTGTGAACTAAAGGCTGACAAAGATTATCACTTTAAAGTGGATAATGATGAAAA 

TCAGCACCAGTTATCATTAAGAACGGTCAGTTTAGGAGCAGGGGCAAAAGATGAGTTGCACATCG 

™gaggcagSaW 

TCTOTACAACCAACAGTTTCCCTTGGGGGCTTCGAAATTACACCACCTGTGGTCTTGAGGTTGAA 
GNGT<^TTCTGGGCCTGTGCACATAAGTGGACAGCACCTAGTAGCTGTAGA«3AAGA^ANAGT 
CANAAGATGAAGATGAGGAAGATGTNAAACTCTTAGCATGTCTGGAAANAGATCTG^ 
GTGGTAACAAAGTCCCACAGAAAAAAGTAAAACTTGATGAANATGATGATGANGATGATGAAGAT 

GATGAGGATGAATGAANATGATGATGATGATGATTTTGATC


RCT-18 




TTATGACATGATTACGAATTTAATACGACTCACrArAW»(jAAl I i\>VjC«-V,iv-v»*»»^^««v**w»i 
CGGCACGAGGAAGTTATACGCCCTGGGAATGGCTGCCCCAAAACTGAAATCATTTTCTGGACCAA 
GGCCAAGAAAGCTATATGTGTGAATCCTACTGCCAGATGGTTACCAAAAGTA^AAAAT^TCC 
GAAGCAGAAGTATTACTTCAACTCCCCAAGCTCCAGTGAGTAAGAAAAGAGCTGCCTGAAGCCAC 

TG^CACCCCAAAMAACCTGCACCCTTTCTTAATCCCTC 

GTTGAAGAATTTCCAAGAAAATAACTTCCCTCTACAAACACGGCTGTAGATT^ 

CCTGCAGTAGCTGAGAGGAGACACTCGAGCTCCTTCCCATACTCAACCCAT^ 

AGGGAGGATATTTTCGAGCAGGCATTTAGTGACAAGCCACTTTGGTAATAGACCTGTTGTTTAGT 

GTTAAACTATCCTAGACCTAGAGGAATAAAAGCATACATGTCGAATCTGAACCATAGCTCCTACT 

AACAAGAGGTTTATGAGATGGACTTCAGTTAGTTTGCACCCTTGCAAAAATCAGGC^ 

AGTTTCCAGAAAGTCCCTAAGAAGCAGACGCATTACCAGCCTAAGGNGANGCAGAGCAGGTCTCC 

NTTAGAGAGAATCTTCTGGAGGGAAATAATGNTTN 
RCT-180




TCTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAAT 
TCGGCACGAGGGTGGTGGCCAAGTTCAACGCCTCGCAGCTCATTACCCAGCGGGCTCAGGTGTCT 
CTGTTGATCCGAAGAGAGCTGACAGAGCGTGCCT^GGACTTCAGCCTCATCCTGGACGATGTAGC 
TATCACAGAGCTAAGCTTCAGCCGAGAGTACACAGCTGCTGTAGAAGCCAAACAAGTGGCCCAGC 
AGGAAGCCCAGAGGGCCCAGTTTTTGGTGGAGAAAGCAAAGCAGGAACAACGACAGAAGATTGTG 
CAGGCTGAGGGGGAGGCGGAGGCTGCTAAGATGCTTGGAGAAGCACTGAGCAAGAATCCTGGCTA 
TATCAAGCTCCGAAAGATCCGGGCTGCCCAGAACATCTCTAAAACGATCGCCACATCACAGAACC 
GGATCTATCTCACAGCTGACAACCTTGTGCTGAATCTGCAGGATGAAAGCTTTACTCGGGGAAGT 
GACAGCCTCAOTAAGGGTAAGAAGTGAGTGTGGACATCAAGAACCCCCACCACCAGAGAAGTTGG 
CACACTTCTCCAGTTTGGAGGGGCCAGCTTAGGGGGTCAAGCATACCCCANCCCTGACCCAAGCA 
TCATGNGATGGATTCTTCTGTATCTGCTCTCTTGGGATTAANGGAAACTGAAGAC 


RCT-181 




TATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC 
GGCACGAGGCTGAACATCTACGTGCCACCGTGCGCTGGGATTACCAGCCAGACATCTGTAAGGAC 
TACAAGGAAACTGGCTTCTGTGGTTTTGGAGACAGCTG^ 

CAAGCACGGATGGCAGATAGAACGGGAGCTCGATGAGGGCCGTTATGGTGTGTATGAAGATGAAA 

ACTATGAAGTAGAAAGCGATGATGAGGAAATACCATTCAAATGTTTCATCTGTCGCCAAACCTTC 

CAGAATCCGGTTGTTACTAAGTGTAAACATTATTTCTGCGAGACCTGTGCA^ 

AACCACTCCACGGTGCTATGTTTGTGAGCAACAGACCCATGGGGTTTTCAATCCTGCCAAAGAAC 

TGATTGCTAAACTGGAGAAATACCGAAAAGCGGAAGGTGGTACTTCTAACACTTCAGAAGACCCC 

GATGGAATCTAATTGCCTTTACTTAGATTTTTTGCAATTC 

TGGTCACTTCAAACCAAAGCAACCGTAATTGAAGAAATATTAAGTTTTCTAAGAAGAATATCCAA 
GTTTTGTATTATATGACAGTGCTACAGATGGGAA 


RCT-182 




TATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC 

GGCACGAGGGTGAATGTCTCCAGCCAGGCCTCCCAGCGTGCACTGACCAACCATACCGTCTACTG 

TTCCACCAAGGGTGCTTTAGACATGTTGACCAAGGTGATGGCCCTAGAGCTTGGGCCCCACAAGA 

TCCGTGTGAATGCAGTAAACCCCACAGTAGTGATGACACCCATGGGCCGGGCCAACTGGAGTGAC 

CCGCACAAAGCTAAGGTCATGCTGGATCGTATCCCACTTGGCAAGTTTGCTGAGGTGGAGAACGT 

GGTANACACCATCCTCTTCCTGCTGAGCAACCGAAGTAGCATGACTACTCGTTCCGCTTTGCCAG 

TGGATGGGGGCTTCCTGGCTACCTAAGCCCTCCCTACCAATACTCTGCTCAACTCATGTTCAGAA 

CATCGTGCCCTCCATCCCTCCAATAAAGCTCTCTGCCCAGCCTGTGTGCTGATTCTCCACCCCNN 

ANAAAAAAAATTCACANNAANNANAANAAAAAAAGTTTGGCGGCCGCAAGCTT 

GAGGGTTAAATTTTAGCTTGGCACTGGCCGNCGTTTTACAACGTCGTGACTGGGA 

TTACCCCACTTAATCGCCTTGCAGCANATCCCCCTTTCGCC 


RCT-185 




NCTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAA 

TCGGCACGAGGGAATTTAAGCTCCAGCAGACCAGCTGCCTGAAGAAGGACTGGAAAAAGCCGGAG 

TGTACAATCAAACCAAATGGGAGGAAGCGGAAATGCCTGGCCTGCATCAAACTGGACCCCAAGGG 

TAAAGTTCTAGGCCGGATGGTCCACTGCCCAATACTGAAGCAAGGGCCTCAGCAGGAGCCTCAGG 

AATCCCAGTGCAGTAAGATAGCACAGGCCGGCGAGGACTCCCGCATCTACTTCTTCCCTGGGCAG 

TTTGCCTTCTCAGGGCTCTACAATCCAAATAAGCCCTGGACAGGGTTTCATCTTACTTC 

AGCCGTGGCGGTACCCACCATATGGCCTCCCAAAGACTTTCAACTCCAGGCTAATAAAACTGTTC 

CTTTCCAAAAAAAAAAAAAAAAAAAAAAAAAGTGTGGCGGCCGCAAGCTTATTCCCTTTAGTGAG 

GGTTAATTTTAGCTTGGCACTGGCCGTCGTTTTACAACGTCGTGACTGGGAA 

CCCAACTTAATCGCCTTGCAGCACATCCCCCCTTTCG 

NCCCCCACCNNTN 


RCT-192 




CTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATT 
CGGCACGAGGGTTTCCGCTTGCTGCTCCGCCATGGCTCGCGGTCCCAAGAAACACCTGAAGCGTG 
TGGCGGCCCCACGTCACTGGATGCTGGACAAACTGACCGGCGTGTTCGCGCCCCGGCCATCTGCC 
GGCCCGCACCGCCTGCGGGAGTGCCTGCCGCTCGCCATCTTCCTGAGGAATAGGCTCAAGTACGC 
TCTGACCGGCGATGAGGTGAAGAAGATCTGCATGCAGCGCCTCATTAAGGTCGACGGCAAGGTCA 
GWUVCCGATGTGGCCTACCCAGCTGGCTTCATGGATGTCATCAGCATAGACAAGAGCGGTGAGAAC 
TTCCGCCTGGTCTACGACACCAAGGGCCGCTTCGCGGTGCACCGCATCACGCCCGAGGAGGCCAA 
GTACAAGCTGTGCAAGGTGAGGAAGGTCTTCGTGGGTACCAAGGGCATCCCGCACCTCGTGACGC 
ACGACGCGCGAACCATCCGCTACCCTGACCGCTCATCAAGGTCAACGACACCGTGCAGATCTCGC 
TAGACAGCGGCAAAATCACCGATGCCATCAAGTTTGATACCGGCAACCTGTGCATGGTAACCCGG 
AAGGGGCCAACCTGGGCCGCATCGGC 
RCT-193




TATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC 

GGCACGAGGTTTTTTTTTTAACTTTTAAAGAAAATACTTTATTACATCATC 

CAACTAGATTCATACTTGCTTGAATCTATAAAAACAAACAAACAAACAAAAAACTGAAAGTTTAT 

TCATTAGACTGTATGTGGGGTCATGTTCCACATGGGAACAGAGAGGCACAAGGGCTTCTAAGTAT 

TGCACAGTCTTGAAAAAAAAAAAAAGGAGTTGGGAGGAGAAGATCACATGATACTGGGAACGTCT 

CACATTATGAGAAACTACCAAGAAACATTCGAAAAGAAAACCCTCTC^ 

TCTGCAGTTCTTGGAATGACTATTCCATTGAAGACATCTTAGTAACAGGAAGCTTC 

ATCCCATGTGCAAATATTAATAGGAAAATATATAAAATAAAAAACCTTTGCGGCCGCAAGCTTAT

TCCCTTTAGTGAGGGTTAATTTTAGCTTGGCACTGGNCGTCGTTTTACAACGTC 

AACCCTGGCGTTACCCAACTTAATCCGCCTTGCAGCACATCCCCCTTTCGCCAGCTGGCGTAATA 

GCGAAAGAGGCCCCCACCGATCGCCCTTCCAACAGTTGCGCAACCTGAATGGCGAATGGGACGCG 

NCCTGTANCGGCGCAATTAAACNCGGNGGGGTGNGGNG 


RCT-194 




TATGACATGATTACGAATTNAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC 
GGCACGAGGGTACCAGCGTAAGCCACGACAAGCTTCCTAAGGTTCAGTGTTATGACCAGTGTGAA 
AACAGATGGACAGTTCCAGCCACGTGTCCCCAGCCCTGGCGTTACACAGCCGCAGCTGTGCTGGG 
GAACCAGATTTTTATCATGGGTGGGGATACAGAATTCTCAGCCTGCTCTC 

GTGAAACTTACCAGTGGACCAAGGTAGGAGACGTGACAGCCAAGCGCATGAGCTGCCATGCCGTG 

GCCTCCGGGAACAAGCTTTACGTGGTTGGAGGATACTTCGGCATTCAGCGCTGCAAGACGTTGGA 

CTGTTACGATCCGACTTTAGATGTGTGGAACAGCATAACCACGGTTCCCTACTCTCTGATCCCTA 

CCGCGTTCG?TCAGCACCTGGAAACACCTGCCTTCCTAATGCAGAGCAAACCAAGGAGAGCACGAG 

TGAGCTCACTCTGACACACACGAGATGTCGTTTCTGCTCTGAAGAAGGCAAGTTTAATGAAGAGA 

AAGAAAAAAAAAAAAAAGTGAGCGGCCGCAAGCTTATTCCCTTTAATGAN^ 

GGCACTACCCGTCGTTTTACAACGTCGTNGACTGGGAAAACCCTGGCGTTACCCAACTTAATCGC 

CTTTGCA 


RCT-196 




TTTTTTTTTTTAAAAACTGAATAATCATG 

ACAATAAAACGTCTCCATAACTAAGGAGTGATATGCCATGTATTTTC 
TATTTTATTAATGTCATTTGCTGTTCAAGTAACACCTACTGCTTTTTCTC 

AACATAAAGACAGGAAAAAGCTACTACCCCCAACAGGAAGTCAAGGGACAATTGGGCGTTTGTTC 
TTTTGTAGAGGCAGTCTCAACTCTTTACTTCCTTCCTC 

AGCGCTGGAGAAATCTCGGGTACAAGCACACGTCTCGGATGTGATACCTGTTCAGAATCCAGCTT 
AGAAATCGTTCCAAGCCCAAGCCATACCCTCCGTGAGGACATGTACCATATTTTCTCTG 


RCT-197 




TTCTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAA 

TTCGGCACGAGGGGAGGTGCCTTCTGTGTTCTAGCCTTGGTGATCTTAGCAACTTTAAG 

AACAATATCTGATTTTAAGAATGTAGCAGTGTGGGAAGATGGTCATGGACCTTTAGATGTCTCAG 

GAACTGAAAGTTCAGAGACATCTAAACCACCACGTCTCACACCACCATGATCCTGATGAACTCAA 

CGGCTGCGATGAACTCAACTGCTGCACCCATTCGTTCCCAGCAAATAGGAGAGAAATTAATTGCA 

GTTACTAATAACATGACTGTTCCAGAAAAGCCCCCCCCTTTGGGAAAGTTTTGTTO 

CAGAATAGTAGTGACTCTTAGAAAGATCATGGATAAGTTCCAACAAGTTGAGCAAATTTATCAAG

AGTTAACTAGAAGGAAAAGAGAAACTAACATTGAGCAAGAAACGAAAGAAATTATAGAATGGTTA 

CAAAAGTTTCCTTTTTATTCTGAGGGCCCATAGAGTTTAAACT 

TAAATGTATATCTGGGTACCCACAAGTCTGGTAGTATAACTGCAGNTTTCTAAACTATTGTTTGC 
GGCTGAGAA 


RCT-198 




TTNACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTCG 

GCACGAGGGGAAGGAGAAGGAGCTGGTGGTGGCCGAGACAGTGGAAGAAGTGAAAAAAGCACCTG 

TTTTGGTGTGTCCACCCTTACGAAGCCGAGCATACACACCACCCAGTGATCTCCAGAGTCGCTTG 

GAATCTCATATTAAAGAAGTTCTTGGGTTCATCTCTTCCTAATAATTGGCAAGATATCTC 

ATGATGGACATGTGAAGTTCAGACTCCTAGCAAATTTAGCTGATGACTTAGGCCATGCAGTACCT

AACTCCAGGCTTCACCAAATGTGCAGGGTCAGAGATGTTCTTGATTTCTATAATGTTCCTGTTCA 

AGACAGATCTAAATTTGATGAACTCATTGCTAGTAATTTACCTCCCAATTTGAAAATC 

ATTACTGAGCAGTCCAGTCAGAACACAGTGAGATCATTCTCATTCTTCTCATTGGGTGACTGACA 

GCGAACTTTGTGAGATGTTACCTATTAGAACTTGGTTCAGAACTTCCTTTT^ 

CCTTGGAGAAGACACATTTTTTTTTCTCTCTGGAGCATCCACAAAGAAAACATTATCACATTT^ 

TAAAGCTATTTATCCCCAATAAAATCAAGTCTTGGTAATTATGAAAACATTCTTATTCCTGGTAT 

ATAGTCAGGGTTGTTGAGAGGACANAAAGTGGTAACATGN 
RCT-205




CAACNNCNNNCNCNNTTATC 

TCGAGGCCAAGAATTCGGCACGAGGTGTGTCTGTAACTGGATCAGCAGCATGTGTTCCAGTCATG 

GTTGACCTTAGCAGACATTAGCAGGATTAACACTGGAGGGAAACAAAGAAGCAAAATAAATATTT 

CCTCTGAGGTGGTATCTGCTGCTTGAACTGTTGTGCATTCATTGAACTTTC 

CCTTATTGCATCTGAAATTCTTGGTCTAAAGTGAGATCCGAATTTCTTTTGCGCACCTTCACGAA

AAAGTATAGAAGCCATATAGTTGGAACTTTGTGTGGGTTATGTCTAAACCTAGGAATAAATGCTG 

GGTTTCTTTTCCTGTTGTAATTTCAGTCAAGTTO^ 

TAAATTACAGCAAAGCGCTATCCTTCTGATCACAGCCAAGCTAACATCTATCTTCCGAACATCAT 

GTCTGCCGCCTGCTGAAAGGCTGTAANGGTCTGGGTAGTTTTCATTTAATATTGATCA 

GTTTATTTATAAGNGCAAAGTGTTTTTTGGACGTTAA 

AAAAAAAAAAAAAAAAAAAAATGGAAGCGGGCCGCAAGCTTATTNCCTTTANTGANGGT^ 
TAACTTGGCACTGGCCGGCGNTTTACAACGTCNNGAATGGGGAAAACCTGGNGG 


RCT-206 




NNNTCTCTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCA 

AGAATTCGGCACGAGGCAGATAAATGCAGGCCAGAAAGGCTGCTGCCGCCGCCGCCGCCACCACC 

ACCACCACCGCCTCCTGGATTGTCTG^CATGGGTCGCACTGCTGTGGCAGACGTCTGGACTTGAG 

CAGAGGGAATAACCTGACTTACTTGCACTGTGATCCCCCTTGCTCCGCCCACTGTGACCTTGAAC 

CCCATGCACTGTGACCTCTGCCTCCCCCCCTTCCCACTGTGATTGGCATGTTGACAAGGGCTGTC 

CCAAGTCAATAGAAAGGGAAAGGGTGGGGATTAGGGGAGGATTAGGGGAACCTACCAAGGACTCA 

GAGTAGAGGGTCAGACAGTGCCACTTGGCCGCTTGGGGTAAAGCCAGTGCCAGCAATAACAGTTT 

ATCATGCTCATTAATTTGGGATTTCAAAACACAAATGAGAACTCCCCCACCCACCCCAAGTGCAT 

GTCGCCATCACTTAAAGTAAGTOCCATTGAAAATATCCTTACTTTTTTTTTCTTC 

TTGTTTAATACAAATACCTGATTTGCAGAAAAAAAAAAAAAAAAAAAAAACNATT^ 

GCTTATTTCCTTTTAATGAGGGTTAATTTTANCTTGGN 


RCT-207 




TTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATT 
CGGCACGAGGTCATGTGCGATGCTCTCATCAAGGCCATTGGCACGGAGCCTGACTCAGACGTCTT 
CTCGGAAATAATGCACTCCTTTGCTAACNGCATTGAANNGANGGGAGATGGGNGTCTCAACAATG 
AACACTTCNAGGAACTGGGAGGTATACTGAAGGCGAAGCTCGAGGAACATTTCAAAAATCAANAG 
TTGCGGCAAGTTAAAAGACAAGATGAAGACTACGACGAACAGGTTGAANAGTCNCTACAAGATGA 
ANATGATAATGATGTTTATATACTGACTAAAGTCTCANATATTTTACACTCAATATTCAGTANCT 
ACAAANAAAAGGTGTTGCCGNGGTTTGAACAGCTGCTCCCATTAATTG 

CATAAACCCTGGCCANACANACAATGGGGATTGNGCATCTTCGATGATATCATANANCACTGTAG 
TCCAGCTTCATTTAAGTATGCAGAATATTTCTTAAGGCCAATGCTCCAGTATGTATGTGACAACA 
NCCCGGAAGTCAGGCAAGCTGCAGCATATGGCCTTGGCGTCATGGCNCAATACGGNGGANATAAC 
TACCGCC 


RCT-211 




TATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC 
GGCACGAGGAAGAAGCAGCGAGGACGACCCCCGGACGGACCAAACCCGTGCGCCGCTGTAAACCC 
GTGCTCAGCGCCGACGTCCCCTGCCGCCGCCATGCCCAAAAGAAATGCTGAAGGGGATGCTAAAG 
GAGACAAAGCCAAGGTGAAGGACGAGCCACAGAGAAGATCTGCAAGGTTGTCTGCTAAACCTGCT 
CCTCCAAAGCCAGAGCCCAAGCCTAAAAAGGCCCCTGCAAAGAAGGGAGAGAAGGTACCCAAGGG 
GAAGAAGGGGAAAGCGGATGCTGGTAAGGATGCGAATAATCCTGCAGAAAACGGAGATGCCAAAA 
CAGACCAGGCACAGAAAGCGGACGGTGCTGGAGATGCCAAGTGACGTGTGTGCGTTTTTGATAAC 
TGTGTACTTCTGGTGACTGTACAGTTTGAAATACTATTTTOT 

AAAAAANAAAAAAAAAACAATTGCGGCCGCAAGCTTATTCCCTTTAGTGAGGGTTA^ 
TGGCACTGGCCGTCGTTTTACAACGTCGTGACTGGGAAAACCCTGGCGTTACCCAACTTAATCGC 
CTTGCAGCACATCCCCCTTTCCCAGCTGGNGTAATAACGAANAAGNCCGCACCGATCGCCCTTCC 
AACAGTTTGCGC 


RCT-212 




ATGTCATTGTACCTGCTTCAAGGGGCNTGGTAGTTAGGGACAGCGGTGTGTAATGT 

TTCTCAAAGGGGAATAGGCTGGTGTCCCGCTTTATGGTTCGGCATAGAGCTCTGTAGACTCAAGT 

TTCAGCTGTTCCAGGGCTTCCTGCTGGGCTTCCAGCATGGCTCTGATGGCGTC 

ATGCTCCTGCTGCTTATACAGTGCCCACCTCTTCAGCAGAAGAGCTCTCCGCTCACTCTTCCTCA 

AAAGAATGCTCCTCCTGAGGTCGCTGTCTTGATTTATCCAAGAACCTCACAGGGGTAATAAAATC 

TTCAATAGGAACCAGTTCTTGGGAGGCTCTTTCCAGTTTTCGGA 

TTGCTCCTTGATCCTTTCTAGGGTCTACCTTCTTCTTCTTCCGTAGAGGT^ 

ATGAGCTCCCAGAAGGACAGCAAGGAAGCTCTCTGGTGGGTGTGTCTGACCTGTGCCTGGCAGGT 

TCCTGGGATCCAGCTCCGAGGTCGCAGAGCCCGCGCAGCACACAGCATCGCAGCGGTAGCCATAG 

C 
RCT-214




TATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC 
GGCACGAGGGTTTCTTCTGAGCTGCATGCCCCTTGCTGGTTAGATGACAGCATACTTGACTC 
CTATGATATTGGAGGTGACAACAACAGGAGACATTTCTCCCAGAATAGCATTGGACTTCTCAGGC 
AGACTCAC^GTCTACTGTTTTGGTCTTTCTCTCTCCTCCCCCTACTTCTC 

ATCAGAAAGATAATACTAAAGTGAAAGCTTTGTTTAAGGTCTTAAAAATTGAAGAAAATCAGA^ 
TTGTAAAGACAGTAAGACTTCAGACATACATTTTATAAGATCACAGTACAATAGTTAGAAGTACT 
GATGAGTGTATTCCCAATCCCTGGTCCCTAAGGCTAAATCCACTGCTTGTTCCTTGCTCCCTCGT 
ATACTCTCAAGGTCTCTTTCAAAGATGGTTGCAGTGTTTGTCTCCATTC 

TTCCATTTAAAAAAAAAAAAAAAAAAAAGTGAGCGGCCGCAAGCTTATTCCCTTTAGTGAGGG^ 
AATTTTAGCTTCGCACTGGNCGTCGTTTTACAACGTCGTGACTGGGAAAAACC 
ACTTAATCGCCTTGCAGCACATNCCCTTTCGCCAGCTGGNGGTAATAACGAANAGGNCCCCACCG 
ATCC 


RCT-215 




ATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGATCCCAAGAATTCG 

GCACGAGGGTAATTTTGGAGGTTTCCCCACAGCAAGTCACTCTCCTTTTC 

GTGGAAGTGCTGGATCAGTAAATGCTAATTTTGCTCATTTTGATAACTTC 

GCTGATTTTGGATCCTTCAGTACATCCCAGAGTCATCAGACAGCATCAACTGTTAGTAAAGTTTC 

AACAAACAAAGCTGGTTTACAGACAGCAGACAAATATGCGGCACTTGCTAATTTAGACAATATCT 

TCAGTGCTGGGCAAGGAGGTGATCAAGGGAGTGGTTTTGGGACCACCGGTAAAGCTCCTGTTGGT 

TCTGTGGTTTCAGTTCCCAGTCATTCAAGTGCATCTTCTGACAAGTATGCAGCCCTGGCAGAGT^ 

AGACAGCGTGTTCAGTTCTGCAGCCACCTCCAATAATGCGTACACATCCACCAGTAATGCTAGCA 

GCAGTGTCTTTGGAACAGTGCCTGTGGGTGCCTCTCCTCAGACACAGCCTGCTTCAAGT^ 

GCTCCATTTGGAGCTACGCCTTCTACGAATCCATTTGTTGCTGCTACTGGTCCCGTCTGCANCGT 

CATCTACAAATTCATTTCAGACCAATGCCAAAAGGANCAACAGCGGCAACCTT 


RCT-22 




TCTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAAT 
TCGGCACGAGGTCGGGCTACATGTGGGAAATTTGCCCTAACTACCGAGTCAGGAGTGATTGGCTC 
TGAGTAAGGCCCAGAAGCTCCCTTGGGTCCCAAACCCCAGGCACTGGCTGCCTCTTGGTCCTGCT 
GACTCTTCTCTCCTAACCCCAGCCACTTAATTTTCTCTGTTGTTCCCTCGAACACACGGAAGCTG 
TTGATGAATCCTTTTCTTTGCTGTGCCAAGGCAGG 

TGTCCCGAGGAGCCAGCTGTCCTTCCTCCCTCTTTAGACCTCCACAGGGACAGACCTGATTTATT 

TATTTTGGTTTAAAAAAAAAAAAAAAAAAAAAAAACNTTGCGGCCGCAAGCOT 

GAGGGTTAATTTTAGCTTGGCACTGGCCGTCGTTTTACAACGTCGTGACTC 

TACCCAACTTAATCGCCTTGCACACATCCCCCTTTCGCCAGCTGGCGTAATAACCGAAAAAGCCC 

GCACCGATCGCCCTTCCCAACAGTTGCGCAACCTGAATGGCGAAATGGGACGCGCCCTGTACGGC 

CATTAANCCCGGCGGNTGTGGTGGTTTNCCCCCACCNGTGAC 


RCT-220 




CTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATT 

CGGCACGAGGGTGTAGTACCTGTAGGGAGTACGGCCTCTCAGAACACTCAGGTTCTTTATAAACT 

TTGTCTTGGTTTTAAGAGAAAAGGAATGTCAGTGTAATGCTCTGGAGGCAG 

CTGGGGGTTCAAGGACAGCTTGGTCTTCATAGCAAGTTCCAAGCTGTACAGGGCTACGCTGTAAG 

ACCTGACCCAAAAACAGCAAACAAGAAGGAAGGAAGAGAAAATAGTATCTAGAGATGGAACCAAC 

TGATGCAGCAGCAGTGGCGTGGGGTTTCCAGACTCAGAAATTTCTTCTTTTCTAATTC 

CATTTGGTTTCCATGCTAACCTTTCCCCTGACACAGACTTAAAAGATCTGCAACAAGGGGAGGCG 

CTTTCTCTTTAGAATGTAGAGAGGAGAGGAATTTGTTTTTATTTTAA^ 

CTGACTGCTGAGACTTCCCTAGCATTCCTTTAAAGTATTTTGTACAGAAGAGAAGAACCCTCCTG 
GAGCGGCCCAGGTAGGTAAGTCTGTGCTGTACACAGCACCTCTCTGCCTCTTCCACTGCTGTGTC 
ACCCT 


RCT-221 




TTATGACATGATTACGAATTTAATAGGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAA^ 
CGGCACGAGGCTAACCCTGTCCACGCTCCTGCCTGCAACCCTCTCCCTGCTTGGCACAGTCGAGG 
AGGAAGATGCTCTTTGCCTATCCCAGCTGCACCCTGGCTTCCTGCTCAAGGGAAGTGAGCACCCC 
ACTTCCTGTGCTAGTTAGTGCCTGATTCTCTGGGTGAGTCCCCGGGCGGACTCCCTCAGCCCC^ 
TCTCTGGTACAGTGGTGTCCGCCCGACTGCCTCCTGTAACCCCATCTTCTAAGCCATCAATTTTA 
TGTTACTATATTGCCCTTTGTGGGGTGGGAGAGGGATCTCCTGGCTCTGCGACTTGCCCCTTTGC 
CGAATAGTTACTGTTCTTGACTTGAAGAGAAGCAACGTGTGGGGACCTCCCCACTGCCCCAGCCC 
AGACTTCTTCGGAAGGGTTGGAAGTTGCTAGACAAATCAGAATGTAGAAGGTGGAGGATTCTGAG 
GAGGAGGCAGAGAATTCTGACTGGGGAGGTATANGTTGGGTCCTCTGCCTCCCACGGCTGCAANG 
TGTGTCTGACCTCTGGAGCTCAGCCCCTCCCCCCTTTCTCTTCAGTGCTGACAAGATGTCNATAA 
ACTTATTTTCATACAATTAAAAAAAAAAAA 
RCT-228




TATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC 

KCACGAGGGACTTAGTGATACTGTGAGCCTCCATGTGAGTGCTGGGAACCAAACCCTGGTCCTC 

TAGAGAGTAGC^^GTGCTCCTGACCACTGAACTATTTCCTGGCCTCCCCACAAATATTTTTAAG^ 

GATGTGATAACAGTTGTAGTCTGTTGCTCTTAGATGATCATGAAATGGTCCAGTGTTGAATACAC 

AAGTAGGTAAAAGGAGATAGAGTTAGAAGGGATGGGTGTGATGCGTGCCTGCAACCCCAGCACGT 

GGGAGACAGAGACTGAGACTCAGCAGTTACAGGCCAGGCTCCACGGGATGAGGAGCTCAAAGCCA 

GCCAAAAATAAATGGACTTTTAAAAGAAAGAAAAAGTACATATTTGGACAGAGAGAAAAAATGTT 

TTCCCATCTCTTGGATTTGCAAAGCATGGGTGTGTTTTGAGCCGTGAAGTACTGCAGGTC 

CGGAAGTGTTGCCTGTTCTGTCCCTCTCTGGGGCGCTTGGGTAANACAGGGCGTGGCTCCTGTTC 

TCAATTATTCCAGCTCTTTTGTGGCTTTTTTTT^^ 

ACCTCANATATCCTTTGAACTCNTGTGTACCCCAGGCTTCCTGCCTCCTGTCT 


RCT-237 




TTTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAAT 

TCGGCACGAGGCTTGGGGTCAGCAGCTGCCAATGACTGATGACTCCTCATGCCTTTGCCAAAACG 

TCACCCGTTCCTGGAGGGACAAGCCTGAGCCGCCATTCCCCCTTAATATTTATGAAGTGCCTGAG 

ACATCTGCTTGCCTTGCTCAGGAGTCCTGGCCTGTGCCGTGTAAGTGTCCTGCTCAGGGGTGGCC 

ATTGAGACTTCCGTCTCATGCCCAGCCCCTACTGTCCATGGGAGATTTGCACTTCACGTCTCTCT 

GCGCACTTTTCCCGTTGGWCACCCACGTGGCCCGCGTCTGTACTCCTAGGTTGCACTGTTTGGTC 

GTGTGTTGGATGGAACCTCGGGCCAGAGGTGGCCCCGAATCACTCACTACAGAAGACAATTGCCA 

GGCCCCAAGCACCCATTGCCTTCTCCGGTGCTGGGCAGATCTCAGGGCCTCGTTTGTT^ 

TGTTTGTTTGTTTGTTTGTTTACGTTAAACCTCTGCCTGGCA 

GTTCCGCTCCTCTGGTAAACAGTGACTCAACNCGTCTTTCTCACAAAAAAAAAAAAAAAAACACA 
TGCGGC 


RCT-24 




GTTAGNTTTNANATGATTACGAATTTAATACGACTCACTATATGGGAATTTGGCCCTC 

AGAATTCGGCACGAGGCTGATATTGAGCGCCCCACCTATACCAACCTCAATCGCCTCATCAGCCA 

GATTGTCTCCTCCATCACGGCCTCTCTCCGCTTTGATGGAGCCCTCAACGTGGACCTCACAGAGT 

TCCAGACCAACCTGGTACCCTACCCCCGAATCCACTTCCCGCTGGTCACTTACGCACGCATCGTC 

TCTGCCGAGAAAGCCTACCACGAGCAGCTGTCTGTGGCAGAGATAACCAGCTCTTGCTTCGAGCC 

CAACAGCCAGATGGTGAAATGTGACCCACGTCACGGCAAATACATGGCCTGCTGCATGCTCTACC 

GTGGTGATCTGGTACCCAAGGATGTGAATGTCGCCATTGCTGCCATCAAGACCAAGAGAACTATT 

CAGTTTGTTGACTGGTGTCCCACAGGCTTCAAGGTGGGCATTAACTACCAGCCACCTACTGTCGT 

GCCAGGAGGAGACCTGGCCAAAGTCCAGCGAGCAGTATGCATGCTGAGCAACACCACAGCAATCG 

CAGAGGCCTGGGCCCGCCTTTGACCATAAGTTCGACCTCATGTATGCCAAACGGGCCTTTGTACA 

TTGGTATGTTTGGAGANGGAATGGAAAAAAGGAGAATTTTCTC 

TCTGGANAAAG 


RCT-240 




TTTTTTTTTTTTTTTTTTTTTC 

CCAACCCCTCACCGTTACATTTTGTGTGGAGCATCAGTCGCGTGCCTGAGGGTCTTGCCTATAGA 
GTCTGTGGTCATCCTGTTGGCCAACAGGTATTCCTTTTGTTGGA 

CTGTGGTGTGATGGAGGTGTGAGTCCTGGATGTAAGTGCGAAGAGTCCACTGTGGAATGGTGGCT 
AACATCCACTTTAGCTAAAATCTCATAATACAGCAAATAAAACACTGGGGTTATTATGCCCACTA 
TCAACATTATGACGACAGCTGTCCACCAACCCATCCCCCAGTCTGCGCCGTAATATGGATCCTTT 
CGGTGAACGCTTTTGTTATCAGGCTCAAATCGGACCTGTTGTGCTGTTAAGGC 

rrvrw"i » ^r•rTYT»r»fTV^r^^^^r*^TVTV■'•^'^rv*'TV , '7Y7^ , 31f , Zk r , 'T w TY22if ir PZV'T w T v Pf^O'r , f ,, 'PA , PfiTC , f5C!G 


RCT-241 




TCOTNACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGC 

CGGCACGAGGGGAGCATGCTGACCTTATTAGTCAGGCTCGCCCTGATTGTAGTACCCCTGAGCTT 

GCCCTAGGGTAGACGCCAGTCATTTCACAGAGCCCACCTACTCAGAACTGCACAACACGGCTCTG 

GTCCTGGGATTTTGGTTTGCAGTGAGAAGAAAACATCTTCGTGTAAACTTGAC 

AGTTTAAGGGTGAAAACAAGTGGATCAGATGTGTGATGAATGCCACAAGAGACTGACT 

TTGTATAGATITATCACATTTAATGGAGAGGTTACAAACTAGACTTTG^ 

GAACTTGTAGTCACTGATAAAGTACAAAGAACTCTCTTCGTGTCTACACTAAACCTAACCAGAGA 
TGTAGAAATGGAGAGGAGGCTCTTCACAGGTCTCCCACTGTCTAGCTTATGGCAGTAAACCAGGA 
GAGCAAAGAGCTCCACGCTTACTGTCCAGGTGACTCTACCCTGTTCAACAGACTTCATTCTTCTT 
TAGTTGTTGACTACCTCAGAAGAGAAGAAAAGGACATTTCACAGAGTGATGCCTTTTACACATAA 
RCT-242




'ATTATGACATCATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCC^GAAl 
rTCG^^CGAGGGACCGAGAGCTGTGGGTTGAAGCAAAGCTGTGAATC 1 
rGTTCTCTCCACACACAGGTCCCCGCCTTTTTAGAAGCAGCCTCCTGGTCTCATC 

PTCOTCACTGCCCGTGTTCACTTTANAAA 1 

Stc^^aaItaaaaataagtaagtttgtaagctattccgacagaanagacaa^ 

roAOTGTACAATA^GCTTTTATATGGAAGACTGTACAGCTTTATGGACAAATC 
roTTTTTAATAAAAATCTAGCAGACCCCAANCNTAAAAANAAAATAAAAATC 1 
^TTATrCCCTTTAGTGAGGGTTAATTOT 

rQ^AAAACCCTGGCG^A^CAACTTAATCGCC 1 

SSSgagIcccgcacccgat^cccttccaacaagttgcncacctga^^ 


RCT-244 




cttctatgacatgattacgaatttaatacgactcactatagggaatttggccctcgagg^^^i 
atcg^cga^gcgWtt^ 

actoatcaggatgtgcatctgaacttcacgcgggaagagtgggctttc 

SctctaSgatc^ 

amacSaaatatcaggaacactc^ 

agctc ataca^tSctctgaagaaatittaatgtct^ 

OTGTAAATAAGAACTGCTraAA 
ATCTGC^ATCTTA^^ 

, ____ - m. .inu , im i-n ■» t\ « tv y" , ^ , «mo/* , T*^inW , 'TY , 7V JV r P2^f"IT w TY2 1 


RCT-245 




tctttatgacatgattacgaatttaatacgactcactatagggaatttg 
attcggcacgaggtittttttttgtttatttct^ I 

nnTCCGTCTGATGGGGW 

TC^TAT^ACACACTAAAGAAACTAGAGAGAAGGCCAGATTTGGGATAGAGTTGCCTGCCT^ 

atcacaSwagag^tm^ccactgctcaaggtgctgg^ 
cccagS^atcagactcaggcagaagaaccttgagctacacat^ 
accacttg^agatatoatctcccaagagacttgggcag 
agtocatcctcaactggangccaacctgagccitacctaaaaacaaagaa^ 


RCT-246 




tttatnacatgattacgaatttaatacgactcactatagggaatttggccctcgaggccaag^ 
Scmgaggca^ctcctgaaatgtotaataagctgcsgttagctc^ 

tgtag^aaaccgtccctggtcttgaatggtctatgtg 

aatcgtS^aa^gaatgtcccattgctaatgctagtagcac^^ 

gt^ggcggagaggggagatgcatcctcctttttcctggagcctgggaaagcgggtca 

^^tg^tcata^acttgaaaatgat^^ 
(I^SSS^g^ 

cctttaatgaggggtaattttagcttggcactg 


RCT-251 




CNNNNNNNNNNNNTCTATGACATGATTACGAATTTAATACGACTCACT^ 

r^TCGCAmTCTCAATTGTTGCTCCCCTTTCCCAGAAAACCCTCCTTGGGTCAAGGTGACAAAA 

a^ScSotgcc^a^cattgctatggtgtagaacagtatctggcacagaat^^ 

GACGGCGTTATCAGCTGTATCTGCTTTAGAATCACCTGGAATCTCCATGAAGGTA^ 

AGGGCAGCCTCCGAAGATGTTTCTGGGCATTATTTTGAT^ 

GCTOA^CTGCTCAlGGGTCACAGTCGGGAACTGTGCTTCTTATACCTGCTCCC 

AOTATCCAAAGGCCTTTTM^^ 
RCT-252




1 AlAjAUArwii. 1A<-*jAA1 1 1 AA1 AL-ijAl^ IvJAv-. 1 Al AvjVj^AA I i i I CGAGGUwAAGAATTC 

GGCACGAGGGGCCAATACTCTACTGGCAAGACCACCTTCATCAGGTACCTGCTGGAACAGGATTT 
TCCAGGCATGAGGATTGGGCCTGAGCCAACCACTGATTCCTTCATAGCAGTGATGCAGGGAGATG 
TGGAGGGGATCATCCCTGGGAACGCCCTGGTGGTGGATCCGAAGAAACCCTTCAGAAAGCTCAAC 
GCCTTCGGCAATGCCTTCTTGAACAGGTTTGTC^ 

TATCAGTGTCATCGACACACCGGGGATCCTCTCTGGTGAGAAACAGAGGATCAGCCGAGGGTATG 

ATTTTGCTGCTGTCCTCGAATGGTTTGCTGAGCGGGTGGACCGAATCATCCTGCTC 

CACAAGCTTGACATCTCTGATGAGTTCTCAGAAGTCATCAAGGCCCTCAAGAACCACGAGGATGC 

AGGATCAGCTGCAGGCCCAGGACTTCAGCAAATTCCAACCACTGAAGAGCAAGCTGCTAGAAGTG 

GTTGATGACATGCTGGCCCATGACATTGCCCAGCTCATGGTGCTGGTACGCCAGGAAAAGACCCA 

CGGCCTGTTCAGATGGTGAANGGCGGAGCATTTTGANGGAAC^ 

GGCTTT 


RCT-256 




TCTCNTaANATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAA 

TTCGGCACGAGGAATTGAGTGACATATCACTCCTGAGTATGCCCACTAGATGCGGTGGAGATGCA 

GAGGCATCCGGACCCCACGCCCCACCCCCTCCCCTCACACACTTACTCTCTGCCTAGTAATGCCA 

CAGAGCTTCCATCCCCATCCAAAGGTCATCAGGCATG^CTATCAGTTGGCTCTCAGGGTGGATTT 

GACATTCTCAGATGATTAGAAGTTGGCaAGAAGCAACCTTGGTGAATAACTCTGGTGTCTAAACT 

CTGTACTTGAGTTACAGTCTCAGTAGAGGAGACGCCAAAGCTGTTGCGAGTGACGGCAGGATTAT 

TGAACAGTCATGATGCTTGGCTTTCAAAGGCGATTATCGCTTTAAGGTCTTAGAATO 

CATCTTTATAACCAGGCATAGCTAGATCATAAACTACTGATGGCCAAGGACCATAGAACGTGCTT 

CTTACCTTCCTCTCTAGTTAGCATTACGACAAACATAATCACCAACGCTCAGGGAAACACTTGCT 

GATTCAAGTAAAATGCATGAACCTTGGAAGACCTTTCTAGAAGTCAGAGATCAAGTTCA 


RCT-258 




TNATGACATGATTACGAATTTAATACGAC 

CGGCACGAGGACCAGTCAGGGAAGAATGTGATGGTGGAGCCCCATCGGCATGAAGGAGTCTTTAT 

CTGTCGCGGAAAGGAGGATGCCCTTGTCACAAAGAATCTGGTTCCTGGAGAATCTGTGT^ 

AGAAGAGAGTCTCTATCTCCGAAGGAGATGACAAAATTGAGTACCGAGCCTGGAACCCCTTCCGC 

TCCJVAGCTGGCCGCAGCAATCCTGGGTGGCGTAGACCAGATCCACATCAAGCCGGGGGCCAAGGT 

GCTCTACCTTGGGGCAGCCTCAGGCACCACCGTCTCCCACGTGTCTGACATTGTTGGCCCGGATG 

GTCTGGTCTACGCAGTTGAGTTCTCCCACCGCTCTGGCCGTGACCTCATCAACTTGGCCAAGAAG 

AGGACCAACATTATTCCTGTAATTGAAGATC^ 

w iVHl ^, 1 1 lij^CuAlu 1 Vjv^^C ALrUL.AvjAL.l~ AAAl_L.V_OAA 1 IvjI\*VjCLvVi\iAATLi 

CCCACACCTTCCTGCGGAATGGAJSIGACACTTTGTGATTTCCATTAANGGC 
CACJVGCCTCAACANAANCTGTGTTTGCATCTGAAGTGAAAA 


RCT-260 




GGGGGTGAACATACAAGAAGGTTGNTGTCCTTTC 

GNGAGTACACGAGTTTTCTCTAACCAGTCACCACACTTCTGAAATAACGCGTGCTAACATTC 

TGATAAAGGGACCGTCCCCTTGGGTAAAGTGTCAAGCAGGGTTAAATATGTATAATAGACAAGCA 

CCATGAGGAATCTGCTCCTGCTCGATGGGTC 

GCAAGTTGATTACATGGTTTTGGCTGACTC 

TTAAAAAACATTCTCATGAATGATTTATCTACAGTACGGTTTCTAATACACAACG 
ATTGCTGAAACTGGTGGTACTTAAGTGTCTCCTTTC 

CAGTCCACCAACTCTTTCAAACCTAAAGTCTCCTGTCACAGATGACAGGATGCAGAAGAGACCTG 
CTGGGATCGGCTTTTGCAACCTGTGCTGCAGCCTTCGCCCTCCTTGGGTGTGAAGTTC 


RCT-264 




CTANNCCNTATGACATGATTACGAATTTAATACGACTC 

AAGAATTCGGCACGAGGCATCTCGAGCAGAGAGCCTGTTTCCAGCACCCGTTCTCCTACCACCAA 

GTCACACTCGTGGCTGGCAC^GAGAAGGTGCTCACTGTTTGCTTGTTCAGACCGCGTC 

CATGGAAAACTCATTGAGTTTAATATTCTGGGCTTAACAGACTGATTATTTTCAGG 

AAGGATATTCGTAAGTATGCTGAAGTGACAGGTGGAGAACAATTCCCATTAATTATTATGTCTAA 

TTATTCCACTTAATAATGATGAGATGCAAATAAGACCAACCAATAAAAAATGAGGAAAATACACA 

AGTATAATGTATAGAAAAAGCACyiAAGTATTACCATTCCTTCAGCTTCGAACAAGACCATGATCA 

ACATCAAAGGACAACCTATAGCCCAAGACATGTGCCTGTCTGCACTCCAGGCTTGCTTACCTTGC 

TACGGATGATGAGAGGGAGTGGCAATAAAACCAAAACAGTGGAAAACCACAAGGAGAAAGCGACG 

ATACACCAAAAGCGTAATTGAGGAGCTTCATGCCTGAGCAGGTGCTTCAACANTTCCCCCTC 

CTCAGGCAGAAGTTAATaACCAGCTGGGATTAATATTTCTCTACCTCATCATCTTTACCTACTG^ 

CTCAGANAGAACCAACGCTGGTTAAAAATAAATCTCATTTTTATTGGTTTN 


RCT-268




kTGACATGATTACGAATTTAATACGACTCACTATAGGG 1 
3CACGAGGGGCTGTTCTCCAGGAAAGATAAGGTTGACTTTGTCCAAAGAATGCTCTTCAACCAAC 

^GGAGCTCTTTGG AAGG AATGGATTTG AATATAGGATGTTCC AG ATGTTTGAATC CTC AC ATAAG 

3ACCTGCTTTTCAGTGATGACACAGAATGCTTGTCTAACCTTCAGAACAAAGCAACCTATAAAA 

ATACCTAGGGCCACAGTATCTTACCCTGATGGACAACTTTAGACAGTGCTTGTCCTCAGAACTGC 

TGGATGCCTGTACATTTCACAAACATTAAACCATTCAGCTGGGGGGGTAACCCAAGGACTGGGGA 

CCCAACTCCTTGCCTCCATGTGGACCTGTGCCGACATCTCTATAACTCAGTGTCATAGTGGAAAC 

AAACAAAAAAATGAATAAAAGTATCTTCCCAGAAAAAAAAAAAAAAAAAACACATGCGGCCGCAA

GCTTATTCCCTTTAGTGAGGGTTAATTTTAGCTTGGCACTGGCCGTCGTTTT 

TGGGAAAACCCTGGCGTTACCCAACTTAATCGCCTTGCAGCACATCCCCCTTTCCCAGCTGGCGT 

AATANCGAAAAGC C CGC ACCGATCGCCCTTCC AAC AGXl\^ Ct- ^ AC v- i.\»varv* ww\ x ow** 


RCT-271 




CTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAAT^ 

CGGCACGAGGCACAITTCTTCAGTCATCATGGAGAAGGTGGGCACTACCACTACGATACCACCCC 

AGACACAGTGGAGTATCTCGGCTACTTCTCGCCTGTCACAGTTCCTCTACCGAATTGACCAGCCCA 

AGGAGACCCATGCCTTTGGGAGGGATTAAAGCAGCTGCTGCTTGATTAGGGGAAAAATGGOT 

TAAGGTTATATTGCTAACCCAGCAATTGACAGTAATTAAAAAATATAGTAGGACCAAAGAGAGAG 

AGAGAGAGACCGCAGCCAACAACAGTCATTTCACTACATTAGCACTGTTTTTATGGCATCCATC 

GGATAAGGGACACTCGTGTTATTTAGCATCTTTATGTGAAGGAGTCTTGGGAACAA^ 

ATTGAGATGCTCACTTTACTCAACCAATTTTCTGTCTTTA 

CTTCACACCTCAGGTATGGGCATGGACAGGGACCTTTCATTATCCCTGCTATCGAGTAACATTTG 
GGTCACATTGCTGAATTTTTCAAGTTCATCTCACAAGGCTAAGGGGTC 


RCT-274 




TATGACATGATTACGAATTTAATACGACTCACTATAGC^ 

GGCACGAGGCGGCAGCAGAGCCAAGTCCTAGATGCTATGCAGGACAGCTTCACTCGGGCGTCTGG 
CATCATAGATACGCTTTTCCAGGACCGGTTCTTCACCCATGAGCCCCAGGACATCCACCATTTCT 
CCCCCATGGGCTTCCCACACAAGCGGCCTCATTTCTTGTACCCCAAGTCCCGCTTGGTCCGCAGC 

CTCATGCCTCTCTCCCACTACGGGCCTCTGAGC 

GATACACCAGGCTCAACAGGCCATGGACGTCCAGCTCCATAGCCCAGCTTTACAGTTCCCGGA 

TGGATTTCTTAAAAGAAGGTGAAGATGACCGCACAGTGTGCAAGGAGATCCGCCATAACTCCACA 

GGATGCCTGAAGATGAAGGGCCAGTGTGAGAAGTGCCAAGAGATCTTGTCTGTGGACTGTTCGAC 

CAACAATCCTGCCCAGGCTAACCTGCGCCAGGAGCTAAACGACTCGCTNCAGGTGGCTGAGAGCT 

GACCCAGCAATACAACGAGCTGCTTCATTCCTCCAGTCCCAGATGCTCAACACCTCATCCTGCTG 

GAACAGCTGAACGACCAGTTCAGCTGGGTGTCCAGCTGGCTAACCTCACACAe^CUAi\jAU^A^ 


RCT-276 




ANNNNNGNNNCCNCTATGACATGATTACGAATTTAATACGACTCACTATAGGGa^ 
GAGGCCaAGAATTCGGCACGAGGCTGACCCGTGTGCAGAGGCATTTTCGTTCCCCTTTG 

TTTCTACCTACACGTACTATTTACC^ 

CTATGGAAACAATGAAGAGAAACGGGGGTTTCAGAAGAAAATTGTAACCAAAl^ 
TATAAGTTTTTGATATCATGATC^CAGGTGATTCA 

GCCTGAAGTGATGTCCATGGAACCCATCGTCTTTGTACAGCGTATGTAGATGGCAATCAT 
ACTTTTGACTGGTCAGAAAAAAAAAACTAATTGTGATTTC 

AGATGATGTGACCTCTAATATTTATCTAATAAATATGTATTCAGATGAAACCTGTAAAAAACAAA 
AGTTGCNTAACANAAAAAAAAAAAGGGNGNGCGGCCGCAA 

TTTTANCTTGGCACTGGCCGTCGTTTTACAACGTCGTGACTGGGAAACCCTGGCGT 


RCT-277 




TTCTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAA 

TTCGGCACGAGGCTGAGAATCTGTATTGTGGTGTATAAAGTGTCTTCCTAAGAGCAG 

CAAATTAAGCAGACTTTCTTTTCAAGCTTATGACTT^ 

TTTAAAGATGAGATTCTAAGCCTAGAATTTTAAACCACATTTC 

AAATTTTCTACAATTTGGTTTTTTTAGACTTAAAAACA 

TCATCACATCCAGTGGCAGAGGGGACTGTCTGAGAGTAAAGTCCCATGATTCAGAAGGACATAGC 

AGGTCAGGATGTACCAATAAAAGTTATGAATGTGATGTTCATTAGAAAGAAACTGAACAACTGAG 

TACAGAAATGAATATCAGCAAGAATTTGTTTTAAGGATTTTC^ 

GTGTGTGCCATATGCCAAGAACTAACAGAACTTCCTGTTCCATCTCTGTACCTTCT^ 

AAAGTGAACTTATGGGTCGGTTCATTNCTGGGGGCCTTATC 


RCT-279




A ivjACAi\jA i 1 Au<jAA1 1 1 AATACGACTC ACTATAGGGAATTTGGCCCTCGAGa^CAAGAATTCG 
GCACGAGGGTGTGACTGATGGTGGAAGCAGGCCTTTCTTTTATTATCCAAACCTTCTC 
ACCATATCTCCTTTAGTACCAACTTCAAACACCT^ 
GTTTTTTTCTGAGTCTTACTTTCTCACTGAAACTTTTGTTTC 

C^CCTCCCCT^CCCTCCTCCCCCTCCAGAGCACAGCTCACTGGCCGCTTCAAGCCTTGTGCTGG 
CTTACTTTGTTTGTAGGGGCTATTATAAAAGCTAATTC 

GTGTTGTATTCCAGCCCAACAAAATGCATGTCTATAACCCCAGCCTCTAGGAGGCAAGGCTAACT 

ggggcccaaggattgccaggaattgacagccagcccgagctacataacaagatggaaagaaaaca 

CTGGTGATATTTTAATCTGATACATTGGTTGGAGGAGTATTTATATGAACATC 

aagaagttattttattggctatgatacagaatatctaagcccc^ 

TCTTACTTTCTTATCTCGGCAGANAACCCTGGTGGTAAGTGGATTTTT^ 
ATCTCTGGCTGAATaATTGGGaaaANATTCTTCTAAGCGNGGGGAA 


RCT-28 




L^JW l A i\» At. A I\» ATTACGAATTTAATACG actc ACTATAGGGAATTTGGCCCTCGAGGCCAAGAA 

ttcggcacgagggaattcagcctcgcggacggctgtcatttcattttgaagttc 
agttgacctgaaatctttatcacagtaataaactctttaatcggtttttta^ 
tccctttgtttacagttactctatatggcattgtgcaatgtttcattg 
cctgttttctattctttggaatcctttcttggtc 

GAGGACCCCCATTGTC^GaaGT^TATTTTCCA 

GCATTTCANCAGTTTGCAGaAGAAGTGATTTGAATGGACAGCTGCAAGTGAGACAGCCTAGACCT 
GAAGAGATGAAACACTCACITCACTTGGAGGaaAGAAGGAAGCAAGCCTAGGGCTTAACAGACGT 

atctgcagtacgtccagccctcccagagcctggccccanccattggacgccctcggctttcattg 
ccttggacagcggatgtaccaggnctggcccagcgggatcctgcnaaggctgaacatgaagagct 
atgaagagtaccagttggngatanatgggggaaccccngggccnagnttgggtttcgatgtcaca


RCT-280




aTCTATGACATGATTACGAATTTAATACGACTCACTA^ 
TTCGGCACGAGGTCCTCTCCCACGCCATCTGCGCATCCTTCTTC 

CCTGCCTCTTTCTGCCCCTGAAATTTGCACTATGTCTCTGGAGGGGAACACTGGGCAGAGGGCGT 

GTGATGTGGGGTTACAGCCCCCCACCCCATTTAGACACAAGGATGCTGGGTCTCTATGCGGATGG 

GGACAATGTTTACAGGCACCCAGTCACACATTCACGTGTGCACACAGGCACACGACGAGAGGCAC 

CCCAGCACATAGCTTGTAGTTTTTGCAATTGTCTTCTCCAGGTAATAGGATGGACACAAAGG

CACACCCCCCAGTTTaAGAAAAGAGTCCATCCCAACCCCCTCCCTGCTCCTCCCCTGGAGCAGGG 

CACCCATCCCTCCTGCAGACCCTTGCCTGGTGGTGAGCAGGGTTTTACTGTC^ 

ttactttctttctatttggtttgtggtgagcttgtctgtc 

TGCCCaaaAGATATTaAAAAAaAGACCAAaATATGTGaAT 


RCT-281 




CTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGaATT 

CGGCACGAGGGTTGGATGaAGAGAACATATGAGCTTGTGAGAGAAGACCGTGACCAGCAGCGATT 

TTTGTGACGTGGAGCACTGCTGACTCATAAAGGGAAGACAGAGaATCTTTTAGAGATCGCATC 

TTTCAGAaAGGCTTGGCCCCATACAGCCTGTTGTTGTTGGACATTCATAGTAG 

GCTTGTTGTATTTGAaAGAAAGAAAaAAAGCATATTGCTAaAAAATCTGGCTGA 

TGaATGGCAGGATGTGGGaaaAATGGATGGTTGGTCATTCAGATGTCTAGTGATACAAAGACAGA 

TGAGTGTGGCCCCAAGCGCTGGCACTTGCTGTGTTTTAGGGGaATCGTATTGGTGGCAC 

tatttctaatatgtattaaagctgtgtatcttga 

CTCTCTTaAAATGCCAAGAACCTCTCTTTTGCTGTC^ 

ttagttacaggttgtcattgacctttaggaattaaatctgaggggt 


RCT-284 




TNCTAACATGATTACGANTTTaATACGACTCACTATAGGGaaTTTGGCCCTCGAGGCCAAGaA 

cggcacgaggggttactccattcaaaataacatactttgaaagcaagtata 

TATTATTTTTCTTCTTAGCTTCCCCATTGTCTGAATTGGGAAAAC 

CACCTGCAAAATGGTTTAATGCCCCTGCATAGTTCCATATCTTTCAACAATAGATTTAG 

AATCTAAACTAGACACCCTGAGAACATCTGTCCTGTCCCCAGCTCCTAAACCCAGGCTTTGATTA 

TGTGTGGCTTGTGAATCCTATCAACCAAAACAGGGGGACAGACATACCTCACCAACTGTATACCC 

TGATGACTCCTTACTCAAGGGCTTTTTGAGTACCTGTTCTTGATAGTACCTGTCTA 

GGGACCCTGGAGCTTTCATCCTTCCCATCTTACTTGCAGGGCGGCAAGTGGCTCCTCT^ 

TTACCGAGCCCCCCTCCAAGCTTAAGTTCATTTGCGGATCAGGGATTAAGCCTGGAATTGTC 

TCCCTGGTGTCAGGGGTTATTGTAAAATGGTAGTAATCTCACCCCAAGCCCTCAGTAAGAACATA 

AATATTTAAAAAATATGNGCATTTGNAATCTGGTTCTGGATC 

AGGAGTGGACTTTAATCTTCTAGTGAATAATTGCCCACTTTGNGGGAAGGN 


RCT-287




TATGACATGATTACGAATTTAATACGACTCACTATAGGG7VATTTGGCCCTCGAGGCCAAGAATTC 

GGCACGAGGCCAAAACCTGACGGACAGATCAGTTTTGACCTCTTATCCTCTGTGGCTCTC 

TACTAATCATGAACATGACCAGCCAGCACATTTAACCTTGAAGGATGACAGCATACCTGTTAATA 

GAAATCTGTCAATATATGATGGGCCTGAGCAGCGATTCTGCCCTGCAGGAGTTTATGAATTTGTT

CCTCTGGAACAAGGTCATGGATTTCGGTTACAGATAAATGCTCAGAACTGTGTGCATTC 

ATGTGATATCAAAGACCCAAGTCAAAATATTAACTGGGTGGTCCCAGAAGGTGGAGGAGGACCTG 

CTTACAATGGCATGTAAAGCCCAAGTGCCTCCACTTACTGGCACACTTGACAGCCAGTTTCTAGA 

ATACTGTAAATGTATGCCAAACTAACCTCCCATATGTTTGGA 

AAACACTGAAGTAAAAAACTTTGTATCTAACGTCCCATAAAATCATGAAATATTTGTCATTAATA 
AAACTTTATAAATAAATAAAAAAAAAAAAAAAAAATCTCGCGGCCGCAAGCTTATTCCTTTAGTG 
AGGGTTAATTTTAGCTTGGCCTGGCCGTCGTTTTTCAACGTCC 


RCT-288 




TCCTNATNAGATGATNACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCAAGAA 

TTCGGCACGAGGAATTAAAACTATATAAATAAGAATAATAGAAAATAAGTTAAGGAAATAAACAG 

TAGTAAAACACCTCTGGTTTATCAAACGTTTAATCATAGAAGGAAACCCTGATGTCACTTTCT^ 

ACGTATGACTCTGTGAAGGAATGTAATGGTACATTACAATAGGAGCTAATGTTTTAAATGTC 

AGTAGTGAAATAATTAACAATAAACTGGAGTTCAAAATGCCAGTCAATGTAAGTACATTCTATGA 

TGGGGCTTTGAAAGTGTTTATTCCATGAAGCAATTCTACAAAGAACATTGATG 

AAACTGTTTGGAAGGTGCTGGGCAAATAACTGGAATTGTCTAAGTGGCTTCACCGCACTGTACCA 

GAAACATATTCTGAAAGTCAGATCCATCAGTGCTCACTGTGCTGCCGAACTTCACAGTAATTTAC 

TTTACTGTTGTGAAAAATAAACATCGCTCTTGTAAACTGTGGTGTTAACATTT^ 

AAAGGAGGCATTCTTTTTACAAAAGAGAAATGCTTTATCTTTCAGAAAAAAAATC 

ATCTTATCCATTATCTGAATGTTGATTCCTTTGCTTATAAGTTTTAGG 

ATTGCTTCTCTT


RCT-291 




GTCATCTTCAGCTATGCAGTGAATATXSAGGCCAGTCTGGACTACAGGAAACCOTGTATTGGACAG 

AGCTAGAAGATCATACAATCAGGAATGTGGGTGTAACAGCACTTACTTTTAAGGATAATGGATAA 

AACTCGAAAGAACATGAATTCGGAATGGTCTAGTTTTAGTGGTGTCTGTTC 

GCGGCTCTAAGATGACATTTAAAACGAAAATACTGCTGACTTTAAAAGGGAGGGAAAATATGG^ 

AGTTACATGTAATAAACCAATTAAGAGGTAGTGTTGGGGCTGCCTCTACACAGTGCCACGTTCTG

GCCAAGAATG1TCTCTACTCATTTAAGGTCAGTTCCAGTACAGTCAGAATCCAACTGCCTCATGA 

CCTCCTCTGCCACTTCACTCACATATAACTAAAGCATGACAAACACTATGGTCCTGAAAAGTGTG 

AAATCTACTGTCTGTTTCATGTGCTTATAAAAAATCAACTCCCCTGTGTATCCCACACGCTCCAG 

ATTCAGTTGTCCAAATCAGTCCAGAATTTCAGAGGAACACACCTCGTGCCGAATTCTTGGCCT

AGGGCCAAATTCCCTATAGGAGTCGTATTAAATTCGNATCANGTNAATCNNNNG


RCT-292 




TCTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAAT 
TCGGCACGAGGCACGTACTAAGCAGACCGCTCGAAAGTCCACGGGCGGCAAAGCCCCGCGCAAGC 
AGCTCGCCACCAAGGCCGCCCGCAAGAGCGCTCCGGCCACCGGCGGCGTGAAGAAGCCCCACCGC 
TACCGTCCCGGCACCGTGGCTCTGCGCGAGATCCGGCGCTACCAGAAGTCCACCGAGCTGCTGAT 
CCGCAAGCTGCCGTTCCAGCGCCTGGTGCGCGAAATCGCGCAGGACTTCAAGACCGACCTTCGCT 
TCCAGAGCTCGGCGGTCATGGCCCTTCAGGAGGCCAGCGAGGCCTACCTTGTGGGTCTGTTTGA^ 
GACACCAACCTGTGCGCCATCCACGCCAAGCGTGTGACCATCATGCCCAAGGACATCCAGCTGGC 
CCGCCGCATTCGTGGAGAGAGAGCTTAAACGGTCCTACGAGCAGTTAACCCAAAGGCTCTTTTCA 
GAGCCACACNANTNNATNANTAGNAANNNNAANAAAACAATTGC 

A\7 l\je\\3\y\j X X nn XXX XxLlw X loiJvn\» X wVa\-«V«V3 1 v^o x X X x A^nnuu x i*vj x vj/»n» x \j\3\3t\n£\r\s—\* x vj\j\^ 

GTTACCCAATTTATCCCTTGCAGCACATCCCCCTTTCGCCAGCTGGNGTAATAAC 


RCT-293 




GNNOTGTCTATGAC ATGATTACGAATTTAAT ACG AC TC ACTATAGGGAATTTGGC CCTCG AGGCC 

AAGAATTCGGCACGAGGCCTCGCTCCTCAACTTGGCAAAAATGCCTACAGAGACTGAGAGATGCA 

TCGAGTCCCTGATTGCTGT/TTTCCAGAAGTACAGTGGGAAGGATGGAAATAGCTGTCATCTCTC 

AAAACTGAGTTCCTTTCCTTCATGAACACGGAGCTGGCCGCCTTCACGAAGAACCAGAAGGACCC 

CGGTGTCCTCGACCGCATGATGAAGAAGCTGGACCTCAACAGTGATGGGCAGCTAGATTTCCAAG 

AGTTTCTCAACCTTATTGGTGK5CTTAGCTATAGCATGCCATGAGTCCTTC 

AAGCGTATCTAACCCTCTCCATTCCCTTCCAGCCACC^GTCATCGCCTCCTCCACTX:CTTCCCC 
CATCCACACCTGCACTGAGCCCACCACACCTACCACACATGCAGCCCACGCCTGACAGGGAAAAT 
AAAACAATGTCATTTTTTTAAATGTAAAAAAAAAAAAAAAAAAAAAAAAATC 
GCTTATTCCCTTTAGTGAGGGTTAATTTTAGCTTGGCACTGGCCGNCGTTT


RCT-296




TATCACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC 
GGCACGAGGATAACATTAATCTTTTTAACAAAAATATGGTCACTTTATCTAAACCCATCCTATCC 
CCAAGTTTAACCAGGATAAGCTATTTTCATTGCCAAAC^ 

GATTTTCTGATATTCCTCCCTGTAAATCTGAATAAATTCAGCAAGTAATAACAATGCCACAATTT 

AAATAATGTTTTCTTGAAAGGAATCATCCAGGGAATACCTTTCCCTCTAACTTCTTTTACTTC 

CCTGAACAGGCAGGTGAACCTATACATCCCGAAATTCTCCATATCTGATACCTATGACCTTAAAG 

ACATGCTGGAAGACCTGAACATTAAGGACTTGCTCACC7VACCAATCAGATTTCTCAGGCAACACC 

AAAGATGTTCCCTTGACATTAACGATGGTCCACAAGGCCATGCTACAACTGGATGAAGGGAATGT 

GTTGCCTAATTCTACCAACGGGGCTCCCCTACACCTGCGCTCTGAACCACTTGACATCAAGTTCA 

ACAAGCCCTTCATCCTCCTGCTCTU 1X5ACAAGTT(-ACA1\j«aw-AIjCu 


RCT-31 




TCTATNACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCT^AGAAT 

TCGGCACGAGGGAACGTTCTCTTCACATCTTTTTACAGGATGAAATCATAGATAAAAGC 

CCATCTAAAATAAGACATGCCTGAGTTTGGCAAGAAGAGGCAAACCCCAAGACCATGAATGATTC 

GTCCGTACCAATGAAGACGTTGTAACTAACTTAGCATGCTGCAGACACTGCTGTAACAAACGCGC 

ACGCCCTGGATGTACTAACACTCACGCTCCCTTTCATTCTGTCTGTTTGGC 

CTTTAGTGGAAACTCCCTCACTTCTCCAGCCTTTCTAAGTAGCCCTTCCCACCGTC 

GTGCACTX5TAGCCTCAGCCCTGTAGACCTGCAGTGTTCTGACTAAAGCTGCCGACTTGCTCGAAT 

TTGCAGTTTCTGTGTCGTGTGCTTCTTGAATCTGATTC 

TGTGATGTCACCACTTGGTCCTAATTTATGTGCAGGAGCACGACACCATTTGTCTCCAGTGCCAC 

ACATACGAGGTGTACTTTGTGTGCAGAATGTGTCTTCCTTC 

GAGATTAATGAGGGAAATCTTTATATTCTGTATAAA 


RCT-34 




ACTAAANNAANNCTCNTATGACNTGNTNACGAATTTAATACGACTCACTATAGGGAATT^ 

ctcgaggccaagaattcggcacgaggcaaaagaaactacaaatcctagattcgtctgaatataca 
gactcagagaatatttagt^catctgaaaac^^ 

ctgtcaaatgtctgctctccattaccacctgtctgacctctgctgagaacagtcgtcagtgcagc 
caccaggtttccgccctctctcaagtttc 

tcatttcattccagattttcccaggggaatagtctgcatcctgcttgctttctgtataaaactta 
caaatcaatcatgaaatgcnctaatttttgtgaatcaggaccctaagtgtttaattgnaaaataa

TTTTGTTTCATAGATTGCTTTAGGTTTGTGTTATCTTTT^ 

NTTCTAAAGTAACTGGAACTGCAGATTAAAGCCAAGGCTTTACAGATAAAAAAATAATTTACTGG

NAATGNCTAAATTTTGGAAAGAATGGCAATTAAAGCCCACTTTCAGGCTTAAAGCCTTT^ 

CACC1WTANTCATCTACACCTCNCCCNTCAAGNNGTGGGCGGCCCCCCANANCCTTTATTTC 


RCT-36 




TTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATT 

CGGCACGAGGGCGCGCAGGGGAGCGTGCCCACGGGACTGTTGGGCCGAAGTCTACCTGGTTTTAG 

AACGATCCAAAGAAATGGCAGGAATGCATTCACTGTATTCCTCTTTCTAGCCCTGCTCCAGATCT 

CTCACAATGAACTGCTACCCGCAAACAATTTATAGTACTCAATGGCAGACATCTGTCTCCTCCCA 

AACGAACGGTTGGCTGGAGGACTCTAACATGCCCCAACTGTGCACACCTCCTCCTTTAGATTAAA 

CCTCAAGGGCTCCCTTCAGCTGAGTGCAATGTGTCTGATTC 

GGGAGTTTAAGGTCAATGTGATTTTCAATATGGAGCTAAGGAGAGAAA 

TGCCTTCTATATACACACACCTATGAGCAATAAAAATGACACTTTTCCCGAAAAAAAAAAAAA 
AAAAAAAAAAAAAAAGGTTGTGCGGCCGCAAGCTTATTCCCTTTAGNGAGGGTTAATTTT 
GCACTGGCCGTCGTTTTACAACGTCGTGGACTGGGAAAACCCTGGCGTTACCCAACTTAATCCCT 
TGCAGCN 


RCT-38 




mwamp a p ath ATTAPfi A ATTT A AT ACG ACTCACTATAGGGAATTTGGCCCTCGAGGCC AAGAATT 
CGGCACGAGGGTCCTTCCCTATCAAAGCCAGATGCTTGAGAAGCCATGAAAGAGACCTCTGAAGT 
GACAGAAAGGAGGAAACAGCCTCAAGCCCCATCTGGAATCTTCCTGGCTGCTGTCCTCAGCCCGT 
TCTTCTGGCTGTTGAGCATCGATCAGCTGTCGTCCCTTCCAATTO 

TATGCCCACTAGATGCGGTGGAGATGCAGAGGCATCCGGACCCCACGCCCCACCCCCTCCCCTCA 
CACACTTACTCTCTGCCTAGTAATGCCACAGAGCTTCCATCCCCATCCAAAGGTCATCAGGCATG 
GCTATCAGTTGGCTCTCAGGGTGGATTTGACArrC 

CTTGGTGAATAACTCTGGTGTCTAAACTCTGTACTTGAGTTTACAGTCTCAGTAGAGGAGAC^ 
AAAGCTGTTGCGAGTGACGGCAGGATTATTGAACAGTCATGATGCTTGGCTTTCAA^ 
TCGCTTTAAGGCTTANAATTAGTAAGGCATCTTTATAACCAGGCATAGCTAGATC
GGAGGGCO^AAGGACCTAGAACGNGCTTCTTACCTTNCTCTCTAGOT 
TCNCCAWNCTCAGGGGAAAACTTGCTGATTCAAGTA 


RCT-39




rcTATC AC ATGATTACG AATTTAATACG ACTC ACTATAGGGAATTTGGC C CTCGAGGC C AAG AAT 
rcOTCACGAGGGTAAAACTGGCCAOSCTCATGTACAAGAAAAGACATGGTCCTOT 
3AAACTCCTGGGGAGCCCTGGAAACCTTGTAGAGGGCACTGGGGACCCTCATTATATACAGAAGT 
ACTGATGTGGACAAAGCTGGATACAGCTATGACCAGGCTGGAGGGACAAGAAGCAAAGGGGTAG 
3TAAAAGAGCTCATGGTGTCAACTGCAGACAAGCCAAGTTGTGAATCCTGGTCAGCACACCCAGA 
3ACTTAGTCTAGAAATCCCTCCAGGATGCCTGGATACCTGTGCTCCCACTGACCTCAGATGAGGG 
^CTGCTGTGGGACTGTGGTCCTTGGAAATCACTACCCTCTTGACGACCCAGGCACAACGGCATTA 

^GTCATTCTCTTCTCATTCATATTGTTTGCTCATG^ 

CTTGAGCATTGGCTTGTTTGGGGAATGGGGGGAGCGTTGGGAGCAGAGTC 

CCCTCAGACTGTCTTCATTTTTGGATGAGAGTGAATAACTCTTTCCACATC 


RCT-40 




Cn-ATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGATO^ 

CGGCACGAGGAAAGATCCAGTCACTGGGTTAGACTACTGGATTGTCAAGAACAGCTGGGGCTCTC

AATGGGGTGAGAGTGGCTACTTCCGGATCCGCAGAGGAACTGATGAATGTGCAATTGAGAG^^ 

GCCATGGCAGCCATACCGATTCCTAAATTGTAGGACCTAGCTCCC7VGTGTCCCATACAGCTTTTT 

ATTATTCACAGGGTGATTTAGTCACAGGCTGGAGACTTTTACAAAGCAATATCAGAAGCTTACCA 

CTAGGTACCCTTAAAGAATT^ 

CCTCCCTATCAATCACCGAACTACT^TTCTTTTTAAAGTACTTC 
GATTGGTTAGATATTGTCAAATATTTTTGCTGGTCACCTAAAATG^ 
AAAAATCTATATAAAAGTGCAAGCTCCTTTTTTAAATTACATAAATCCCATX5 
AATAGTTATTTTTTAAAGACTTTAAAATAAATGATTTAATC 


(N 

i 




tnnnnnottcnngnnnnntctatgacatc i i ^ iACGACiCAL i*ii™~V?iii m^T 

CCCTCGAGGCCAAGAATTCGGCACGAGGGTGGGTGGGA 

agagagcaagcagagcaagcaatggggagggagc^ 

TGCATCAGCTCCTGCCACCAGGTTCCTGCTTTGCTTGGGTTCCTATCTCG 

AAACGGTGATATGGAATGTAAGCCATATAAACCTTCTCCTCAATTCGCTTACGTO 

CATCAATAATAACCCTGAGACAAGCAGGAAGCACTCTTAACCACAGAACCAACTCTCCAGCCCCA 

CGGTGACACTCTTGTCTCAAAGAAGGTAACAGCCACAACCACAAAGCCAATCAGAACCATGTGCA 

CGGAACACACAATTCTACGATGCTGGGTTAAAGATCAGAACACGTGTCTTCAGAAAAAAAAAAAA 

AAAACCTTTGCGGCCGCAAGCTTATTCCCTTTAGTGAGGGTTAATTTTAGC 

CGTTTTACAACGTCGTGACTGGGAAAACCCTGGCGTTACCCAACTTAATCGCCTTGCAGCACA 

^^i-«mfTvn^i r*r* r» r*r>rnr^r*n nn tmti t a rir'r* A AMAf^NeCCM^ACCGATCGCCCTTTCCAACAGTTN 
CCCCCTTTCGCCAGCTGGGCGTNAI A^wwwwwvjnwv* vj^ft^wn* v»wvv w * * * » » — 


RCT-43 




GGCACGAGGCAACAGAACCCAAAGAGGATGCTTTCCGGAAGCOTTTCCGC 

CGGCGGGGTACGGCGGACCTAGGAGCGGTCATCGACTTCTCAGAGGCTCACGTGACTCAGAGCCC 
GAAGCCCGGCGTGCCCAAGGTGGTC^GATTCCCTCTGAACGTGTCCTCAGTGACTGAGCATGATA 
CCTCTAGGGCAGGACTTCAACCTGCCTGGAGAC^CCTCTCCCAGCTGTCCTCCCTAGCGGATCM 
TGCTTGAGCCCTGTTCTGTGGAAGACTGGCAGGTGTGTGCCACCTACTTGAAAACTGCCCGAGTT 
AATATGACTGTTCGGCAGGTACTGGCCACAGGCCAGGACTTTCCTTTAGAACCCATGGAGGAGCC
AAAAAGAGACATTACTGCAGATGGTTTGTGCCATCTGCATGACCAGAAA^ 

AGAGGTTAAATACCAACGACTAAAGCTTAGGGGACTCTCTGGTTGCCTCACACAGG^^G^AA 
GAACTCACCTGTCATGCTTCTGCCTTGGGCTCATC 

ATGTCCTACTCGGCCAGTTCTAGCTCTTATTANGNGNGNGNGNGTTTGTCATACCC 


RCT-45 




TNNTOACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCT 

CGGCACGAGGCCAGGCTACCCTGTCTCCTCCAGGCACACAGGTGCACACACAACTCATGCACATG 

CATATAAATGACCACAAATATATACTGACACCCCACACATATGTAGACACCTGTATATGCACAGG 

TCCCACTACCACACAAACTCACCTGGGAAGGTTTTCTTTAGATGAAGCGTTCCCTACCTGGTATC

TCTTCCTTCCCAGCCTGGTGCCTGGTGGTAGTGCCTCTCTGAACAGTGCTGGGTAGAGGTGAGAG 

GAAGCCCTACITTCTACCCTCTCCCCAGCCTCCTCTAGGCTGTGGCTACAATCTCAOTCTC 

AGAGACTGCTTGGTCAGCTTCCTACCATAACACCTACCCCAGGGCTTATCCTACTGGGAGCTGAG 

ACAGGTCCCAATGAGGTCCCACTGTGCACTGTGCCT^^ 

AACATCACCATTCTGCCACTTACAAGGTGGAGAAAGACAGGTCTGGTGGCTTANGCATGACCTGG 

ANGTCCTCACAGCCCATTTAGCCTGTCTCAATGTCCTCAGTTTGGCG^ 

CANANACCCNCCTNAANTTCANGGGGAG 


RCT-49




TTATCCATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC

GGCACGAGGGTGAGGTGTTCTCGTGGTTACGATAGGTCTCTTCCCTGTGATATTCATTTGCAGAT 

GGCTGGACTGATCAAGCAGTACAGAATGGAGGTCGGAGGGAGAGAAGGTCCTCCAGGGAGATGAG 

AAATCGCCGAGCACCTTAAGTCTCAAGGTTTGCTGACGGCCAAGACCAGGCTTTGAATGAATGGT 

GAACTCAGAGGGGAGCGCGTTGGCCTGAGGAACCCACGGATGCCAGTGTTGGTCTATTCTTGCTT 

TCAGGTACCCCTTGGAACACAGAATAGCAGTCTAGTCCTGCTGCCACCCCCACAAGGCTGGGCAT 

GGTTCAAAGGCATGCAGGATGCAAAGAAGAGTCAGCTTTGGCTGGGGAGGAGTGGTTTGGTGTAC 

ACTGCTACTGAAATAGAAACTTTTGGCCTTCTGTCTGTAGAAATAAAAATCTGACTTGGTGATGT 

TTTTAAAAAAAAAAAAAAAAAAAAAACATTGCGGCCGCAAGCTTATTCCCTTT 

TTTTACrTCGCACTGGCCGTCGTTTTACAACGTCGTGACTGGG^ 

TAATCCCTTGCACACATTCCCCTTTCGCCAGT 


RCT-50




TATGACATGATTACGAATTTAATACGACTCACTATAGGGAA1TTGGCCCTCGAGGCCAAGAATTC 

GGCACGAGGGCAATCATGGCTCCGGGTTGGCCGCGGCCTCTGCCGCAGCTCCTCGTGTTGGGATT 

CGGGTTGGTGTTGATACGCGCCACGGCCGGGGAGCAAGCACCAGGCAACGCCCCATGCTCAAGCG 

GCAGCTCCTGGAGCGCGGACCTCGACAAGTGCATGGACTGCGCTTCTTGTCCAGCGCGACCACAC 

AGCGACTTCTGCCTGGGATGCGCAGCAGCACCTCCTGCCCACTTCAGGATGCTATGGCCCATTCT 

GGGAGGCGCTCTTAGTCTGGCCCTGGTTTTGGCGCTGGTTTCTGGTTTX:CTO 

GCCGCCGGAGAGAAAAGTTTACTACCCCCATAGAGGAGACTGGTGGAGAAGCTGCCCAGGTGTGG 

CACTGATCCAGTGAGGAGCACCCGCGCTGGTGCCCATTCATCGTCCATTCATTCATTCTGGAGCC 

AGCCTGGCTTTCCAGAGACAAGCCGCGCCAGACTCTTCCAACCACAAGGGGGTGGGGCGAGGTGG 

TGATTCACCTCCAAGGACTGGGCTTANGGTTCAGGGGANCCTTCCAGGGTGTCTAATTGCCCTGT 

CTCTGGNTCTGGGGCAGACAGANANCCTCAAGCTAGGTCACAA 


RCT-53 




ATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTCG 
GCACGAGGATTTATTTACTTTATGTATGTGAGTACACTGTTGCTATCTTCAGACACACCAGAAGA 
AGAGGGCATCAGATCCCATTACAGATGGTTGTGAGCCACCAAGTGGTTGCTGGGATTTGAACTCA 
GGACCTCTGGAAGGGCAGTCAGTGTTCCTAACCGCTGAGCCATCCCTCCAGCCCCAGCCTGTTTT 
TATGGAAGTGATTCTCAACTCATGGGTCATGACCCCTTTGGGGGTTAAATGACCCTTTCACATAT 
CAAATATCAAATCAAATACCCTGCAGAGCAGATATTCACATTGCAATCCGCAACAGCAGCAAAAT 
TACAGTTACGAAGTAGCAAAGAAAATAATCTTACGGTTGAAGGTCATCACAACACGAGGACCTGT 
ACTACAGAGGTCTCGGTGTCAGGAAGGTCTAGAACCTGCTGTCATGGGGGGGTTGCAGATCAGCC 
GGGGCTACACAATGAGACTCATTCTCAAGAAAGAAAAAAAAATGACAGGAAATATAAGCCTGACT
GTGCGCTCCACCAGCTCTACCCCAGCCCTCACCTACAACCACGCAAGGATCTGCTTCGTTCTCCG 
AGAGAGTGTACTTCCCACATTGCTTTATGCCTCNAAGTCATCCCTACNATGNGGGCC 


RCT-59 




CCNNNTTCTATCACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCC 

AAGAATTCGGCACGAGGCAAGATGGTGCCTCGGGGGCTGAGGGAGCTCACAGGAACTGAGCAGTG 

ACTGGTCCTTTCCCAGTATTGAATACTGAGCCCCTGTGGGTGTCGAAGCACTTAGTGGGTCTGGC 

CCCAACCCCAAACACCCCTGTTTCTGTAACACCCTGAGCTGGACTGTTTATCTTTAGCCGGGAGA 

ACATGTATTTTGGTCCCTTCCCTGTCTCCGCTCAGATTGTAAACCTCCCACGTGTGGGGATCACA 

CCCTGCACTGTCCCGAATCTTTACACCCTATCCCAAAGCTGGTGCTCAATAAATACTTCTAGATG 

ATTAAAAAAAAAAAAAAAAAAAAAAAAAAATTGGAAGCGGCCGCAAGCTTATTCCCTTTAGTGA 

GGTTAATTTTACCTTGGCACTGGCCGCCGTTTTACAACGTCGTGACTGGGAAAACCCTGGCGGTA 

CCCAACTTAOT^GCCTTGCAGCACATCCCCCTTTCGCCAGCTGGGCGTAATACCGAAAAGGCCCG 

CACCGATCGCCCTTTCCCAANAATTGCNCAACCTGANTGNCAAATGGGACCCCCCCCTGTAACGG 

CNNATTTAANCGCGGGC 


RCT-6 




TTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATT 

CGGCACGAGGTGCTAGCCCTAGAGAGCAGTGCTCACTTCAGACCAACCAGCCTCCGGTGCTTCTC 

GCCCGAGCAAGAAAGTTTTCAGCGAGGAAAGTAAGTTTCCATCTGTCCAGCCATGGGAGAGGACG 

CTGCACAGGCAGAAAAGTTCCAGCACCCAAATACAGACATGCTCCAGGAGAAGCCATCCAACCCC 

AGTCCAATGCCTTCCTCCACACCGAGCCCCAGCCTGAATCTGGGGTCCACAGAGGAGGCCATCCG 

AGACAACTCACAGGTGAATGCTGTCACCGTGCACACTCTCCTGGATAAATTGGTCAACATGCTGG 

ACGCCGTGAGGGAGAACCAGCACAACATGGAGCAGCGGCAGATCAACCTGGAGGGCTCGGTGAAG 

GGCATCCAGAATGACCTCACCAAGCTCTCCAAGTACCAGGCCTCCACCAGCAACACAGTGAGCAA 

GCTTGCTGNAGAAGTNCTCGCAAGGTCAGCGCTCACACGCGTGCTGTTCGGGAGCGCCTAGANAA 

GCAGTGTGTACAGGTGAAGAAGCTGGAGAACAACCANGCTCAACTCCTCCGANGCAACCACTTCA 

AAGTGCTCATCTTCCAGGAAGAAAGTGAGATCCCTGCCAGTGTGGTTTC 

AGCACOTGCNNAAAGCNAGGACTTGCTTGATGAAAACNAGNCCTTGGGAGGAAACTNT 


RCT-60




rcTATGACATCATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCC^GAAT 

PTGTACAGGGOTATGAGAAGGGCCAACCTGACGAGGAGCTCAAGCCCAAGAAGAAAGTCTTTCAG 
VAGCTCCJ^CCGAC^ 

^ccSgctcgg^tc^ctgtaaatcactaaaagggggcaacatcagctagccgctcg^ 

ktcccaccatcctgtctgccggtcgttccacctctcacccgctcccatctcaggacactgaagc 

^^gtc^a^ctcactagacgcaggacttggaaagggacacktcaccttcctaccatgtgg 

sctcatcc^cctcg^aaKggagacggagi^ct^ 

^tctaccttcttcccttggcagctgacttgagaaatctg^^ 

S^cctcggtccttgtaattggaaaaacactggttcccaagattccatggggt^ 


RCT-61 




2NNNNNNNAATNNNNNNNNTNCT 
3?rcCCCCATNGTTTGAATTCTCCT 

CTCTGTCTG^G^TGAGGGGOTCCCGAGAGGOTCCGGAGCCCTGCGTGAAAGCAAGCTCTC 

tattcaaGACAAACATTCNCCANTAAAGAAAAAAAAAAGAGAAAGATC 
TTTOTTTCACAAGAGTOTOACTOAATTTTTTGGCCTA 

C^AAANAAAAAGA^ 
A??TTANCTTGGCCCTGN^ 

rTTAATCGCCTTGCAGCACATCCCCC .„..„ 


RCT-62 




ATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTCG 
gScGAGGATC^ 

rTHA'TCTTOACC^AAATTCTACCAAAGA 

aga^?a^gtcctctagtctcaagaccttttttgtttgtttgagac 

GGAGCTTGCTGTATAGACCAGGCrrGGCCTCAAATAGAGACTTGCCTGCTTCTGCC 

tcggaot^ggtctctgctctcacaccctgcccctgttgagcatactgacggcagaactctgat 

C^A^AATGCTCATCT&GACATCTCAGTrTATGCTTTCG 

ggKcctgaagcaccacattaaattttgaactatt^ 

ataagtaagttcanaattctctgggggaaaaccatnaag ' 


RCT-64 




CTATCACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGC 

CGGCACGAGGCTACTAGGTGGGCACAGCCTCTCCTCACCTGGGGACTTCAGGGTTTGTCCCAGTG 

A^TA^CTG^CTCCCTCTCTCAGGCAAACTGAATGGGGCATCCTATGTTC 

G^^^^T^^GCTTGGGATK^GATCGTGTCTCGGCGCTCCCTGTCTGATCAGA 

AACGTCTTTTTCTAATGAGGCTTATTAGACTTC 

CAGCCATCTOMWACAGGCTATCTCCAGAGGT^ 
rATCGCTCGCAGMACCAACTGCCTCGGACTCCGGGGCTCTGTAAG^ 


RCT-66 




TCTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAAT 
^S^CGAGGA^AC^^ 

CTATTTGGGGGAACTTCGAAGAATACTTAGAAATGGCCGTGGATTTGTCACACGG 
A^GACM^GGACCTTGCTTGGGTTCCTGGGGTGCCCACTGTCCAGCCA 
GGAGC^CTGTGCTATGGCACCGGGCCCGGTCTTCACTGTATATTC 
OTA^CTCTOITTATTTCGCTCACCCTGACTGCTGCACTTGTCGGTGACGTTTC 

CTGTGTOTCTTGCTTTCGGGGTGGGAACTGCTITCAAACCCTTGTTCGGATC 
A^AATGTCTT^TGT^TGGATOAAATTCCTTTTTGA 


RCT-68




rATTACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC 
3GCACGAGGTTGATAGAATCAGTAAGTTTCTAAGGGAAAGGAAATTGATATTTTGCAGACCAA1^ 
rcTTCTAACCAGCATCCCAACTCTAGCTCTGTCAGCCACGTTACCGCATCCACCCTTTACTGCAT 
SCTCAGGTCGCTGCAGTCTGGTTCTCCTGGGAGATTTTCATCATGTAGCTATTGGATACAATTAT 
3AAAACCAACTGTTGAACATATTTGGAGTAGCTGTTTCTTTCCTAG 

CTGGTAGAACAGGTTGAAGCCCGCCTGCATTAGCTGTGCTITCCGTTATGTTTAGAGGGATGCAC 

AGGCACGACATCATTCCAGGAAGGAAAATTGTGGTTAAGAATTTCCAGTAAGATCATACTTAATA 

GCTGAGATTTTTAAGGCATTTTTATGTTTTCAATGACATAACAAAAGT 

AAGGTGAACAAACCCTAGGTTCTCTTCAGACGTTAATTGATAGTATTTAATGTACGTO 

AACCTCCATTAATGACATCTTTTCTTTGTGGTANGGCCTACCTTCTGCTTTCCTGGAA 

AATATACATCATTTGAACCTATTTTGAGTTTTCTGTTGGGGGGGGGGGGGGGTT 


RCT-69 




TATGACATOATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC 

GGCACGAGGGTGAAGGAGGAGGATCTGTTCCATAGCTACACAACCATCATGGCTCTCGTTCACCT 

GGGGGCCTCCAAGTGAGCTGCCTTGTTGAGTGGATTTCACATTA 

TTTTTAGAAAAATGGAATCTCGGATACAAATATTTTAAACA 

GCTTTGATTGATGCTCTGAGGTCACTCCGGGATGATTTTTTAATACGCA 

AACATTGTGAGCACACAAGTCAAGAAATGAGGCTGTTGTTAGTGCAAGCAGCCGCTGTGGCTGAA 

CAGATTCTGCGGCCACTGCGTAAGTGCCCTGGGAGGTTGAGAGGCCACCCGGAAGGTGGAGGTTC 

AAGTAGTGGTTGGTGACTTTTAGCAGAATCATGAGAAAATTAAAGAAATGGTTTGAGAAGTC 

CTCTGCTTTTCTGTTTTTTACAGAAGCTGTAACAAATATTACT 

TTAGTGTTTTCCTCTTCCTAGTAAAGAGACTGCAAGGG 

ATCATATCACCATATTTTTGGTTTTACACCAATAAAATATACTTGAAAAC 


RCT-72 




TATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC 
GGCACGAGGCTCAGACACTGAGGATTGGAAATAAGGGCCTGGAGTAATACCCAGTCTTCAGTAAC 
CATAGGCCCAGCCAGAGACCTGTGGTGGTTCAGCCACAGCAGGGGCTGGGCAGAAAACCCCTCTG 
AGCTGCCAAAGCTTGTAAGTGCCAAGTAAGCCCCTTTCCTCCCAGCATGCTCTGGCTGAAGGGGT 
TGGCCCTGCCCTGACACCTTCTCAGTGCCTCTCCCTATGCTTCCCCAATGATGGTGCCCAGCCTC 
GGGCTGGGCCCACATATGCCAGTATGAAGGCCGCCTGGATGAAGAGGGCAGGCTCCCACTCCCTT 
CATTTCCCCTAATGGGTGCTGGCCTCCCCACAAGTTTCAAGACTGACATTTCAAGGCAGCTCTCA 

GGACTGCCATTTTTCTCTAC^CCTGTGGGTTTAGCTTTTGTA 

CAAAAGTGTTTCTATCTTACTGGTTGAATTTAGCTTACTAATTCACACACCTGGAAGCAGCAAC 
TGTAGCATGTAACCAAGCCAGTTGGTTACTCTGGAAATGGGACAGTATGATGCCATCCCATGTTG 
AACACTTGCGTCCNATNAAACAACGCTGGAATTC 


RCT-74 




CTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATT 
CGGCACGAGGCAACTTTTTTCCGAGATATCCGCCAGACCCCTC 

ATACTTTTGTCCCCTC^GGCGATGGTTTGCCTGCAAAATCCGCTCAGTGGTGAGCCTGCTC 

ACACCTGAGAAGGCATGACTCCTCCCAATAACTAGCCAGGTGGACCAAGGAACCCGGCTCCCCAT 

TCCCAGCAATGGGACCCATCGCGGAACCATCGGCACCCACACCAAGTCCTCTCATGACTCAAAGT 

CCACTGCAGCCTAGGAGGGTTGTTTCCCAGAAGAAAGGGATATAGGCTAATGCTCTGTAAACTGG 

GAACATTCAATTTCTTCAGAGGCTCTTCCAAAAAATGGCTCAGCAACGCAAATGTTTC 

TCAATATGCAGGATAATTTTGGGGTTTGAA^ 
TATCCAAATAAAAGTCATTTATTTTACT^^ 

CGCAAGCTTATTCCCTTTAGNGAGGGGTTAATTTTAGCTTGGCACTGGCCGTCGTO 

CGNGACT ^ 


RCT-76 




ANCNGAGANNNNNTNNTCACATGATTACGNATTNAATACGACTCACTATAGGGGAATTTGGCCCT 

CGAGGCCAAGAATTCGGCACGAGGCGCTTGCCTAGCAAGTGCAAGGCCCTGGGTTCGGTCCCCAG 

CTCTGAAAAAAAAAACAAAAAAAAAACAAAACTTCTGTAANTAGTTGGATTTAGGCCCCACTCCT 

CAGCACTGTGAGATGAATTCACGTCATTTATGTTTCACAGNGTTTTTGGAAAGCATAGCTTGTAA 

CATCAGTTATACTACTTCTTGATGACTTCATGTAACCTTCGAAGTTAACGNGAAGCGATGCTTCA 

TCTTTGCTCCGNAAACTCCAGTCACTGTTTTCNCATTAAACCTGTAAANAGCNANACGGNGAAAN 

NAGAAGAAAAAAAAAGAAAAAGGTGGGGCGGCCGCAAGCTTATTCCCTTTANTGAGGGGTAOT 

TAGCTTGGNACTGGCCGNCGTTTTACAACGNCGTC 

ATCNCCTTGNACAACATCCCCCTTTCNCCAGCTGGCGNTAATAACCNAAAAGGCCCCACCGATCG 
CCCTTCCCAACAGTTGCCNCAGCCTGAATGGNCAAATGGGACNCNCCCTGTAANCGNCCCCATTA 
AACCCCNNCGGGNTGNGGGNGGTTACCCCGCAAGCNTGACCGCTTACNCTTGCCCAGNCCC 


RCT-8




:taannctaannttttnacatgattatc 

3AGGCCAAGAATTCGGCACGAGGGAGAGTATGGATTCCAAAACGCCGTTCTGGTTCGATACACCC 

AGAAAGCACCTCATGTGTCGACCCCAACTCTCGTGGAGGCAGCAAGAAACCTGGGAAGAGTGGGC 

^CCAAGTGTTGTACCCTTCCTGAAGCTCAGAGACTGCCCTGTGTGNGAAGACTATCTGTCTGCCA 

rcCTGAACCGTCTGTGTGTGCTGCATGAGAAGACCCCAGTGAGCGANAAGGTCACCAAGTGCTGT 

ATTGGGTCCCTGGTGGAAAGACNGCCATGTTTCTCTCCTCTGACAGTTGACGAGACATATC 

CAAAGAGTTTAAAGCTGAGACCTTCACCTTCCACTCTGATATCTGCACACTCCCAGACAAGGAGA 

AGCAGATAAAGAANCAAACNGCTCTCGCTGAGCTGGTGAAACACAAGCCCNAGGGCCCAGAAGAT 

CANCTGAANACGGTGGATGGGTGACTTCGCAAAATTCGTGGACAAGTGGTTCCAAGGCTGTCNAC 

AAAGGATAACTGCTTCGCCCCTGAGGGGCCAAACCTTTGTTGCTAGAANCNAAAAANCCTTAACC 

TTAAACACATCACAACCATCTCAGGNTACCCTNGAGAAAAAAAANACCTTGANTAN 


RCT-80




CTTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAAT

TCGGCACGAGGGTTGTGCCTGATGGCTGACAGAAAACAAGCAGGAGAAATATACACAAGGGCTGC 

TTCTAGACTGTTCAGAGGAAGTTAGGTGGCTGACTCACCTGACCGGTAACCAACCCTGCCTTCTA 

AGTATGGCCACGAACAGATCAACACCTGCTCCTTCCTGCCATACCTCAAAGTGTTAAATACAAGG 

GAATTCAAAGCAGTTCATGTCATCATCACGTCTATCCAGAATCTTTCAATTTAA 

GGTTTTTTCAGAAGTGACATTTTGGCTTTATTTTCCTTTCTAACTC 

TGATACCTTTTACAGAATAAAAGTGAACTATGAGACACTCTCTTACCTTAGAAGTTACCAAAAAA 
AAAAAAAAAAAAAAAACATTGCGGCCGCAAGCTTATTCCCTTTAGTGAGGGTTAATTTTAGCTTG
GCACTGGCCGTCGTTTTACAACGTCGTGACTGGGAAAACCCTGGCGTTACCCAACTTAATCGCCT 
TGCAGCACATCCCCCTTTCGCCAGCTGGCGTAATANCGAANANGNCCCCACCCGATCGCCCTTCC 
AACAGTTGCGCAACCTGAATGGCGAATGGGGACC 


RCT-83 




TATGACATGATTACCAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAAATT 
CGGCACGAGGTGATTATTGTGATGGAAATTCTC 

AAGTGGAGGAAGACAGCAATAAAATGAATCCAAGTTTAAAAATGGGAAGACATAGGGTAAGAGCT 
CTCTATTATAAAACTGATTGCATTGTGTTTCTTCTTACACATCAGTATATTGCATAGCTC 
GAAACATTGACGAACTTGCTAAGCTTACTTTTCACAAACTCAAAAAATCCTTTAAAGGGCATAGG
AAAAAAAATCATTCAACAATAATACCCTTTTGCTTCCTTAGAGC 

AAAACAGGACACCCTTAGATTTTCCACCTATTCGTAACATGGACGCTGGAACTCACAGGAGAGCT 

GTCTGTGGTTCCACAAACATCATATTTTTTGGTGAGTGACTGTGGTTGTGA 

GTAAGGCTCTCCAGGCAGTGATGGCTACTTCAGGTATGACTACCATGAAAGACTATGTGGAATTT 

CTTTTACCTTTGGAAATGTTTGAAAAGCTAGTAAAAAGAATTAGGGAATTT 

AAAATATG _ 


RCT-84 




TNATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATT 

CGGCACGAGGACAAAAGTCACCATGGCCTACCTATGAAGAGACCAACAGGTACCGTCTTGTATGC 

ACATATATTACCCACACACGCATCACGCATATCCGCACCGTTTGTTTCCTATATACAGGCATAAA 

ATAGAGTAAGCCCAGGTAGTTTTTAAAGTACCCTTCCGTGTGACTACCGTTGTCGTTC 

TGAGAATAAAAAGTTGTTCATTATGTCTGGAAAGGAGTCGAGTTTTGTCCTGTGAGCATGTCGGG 

CTAAGAAGAACATCAGGGCTCCCACTAAGGTTATCTTCCCGCTGACAAACCGTAAGGGAGCCATC 

GGAGCTCACAACCAAATTGTTCTCTCTGGAATGAAGCTAGTGCCAGCCTGTGGCTTTCGGGCTCA 

GCAGGAGCTAGGGTAAGGNAAGTGTCTTGGTACATTTCAATGCTC 

ccccacacgcacttgcgtgcgcgcgcgcgcacacacaca^^ 
gcagtctggtttggccttgtgcttttgagtttgcc 

TTTGGTGTAGTGGAGATTACATGCGTANAGGNCCNCATATGGACTCCTCCGTTTC 
r»aai7iraMAiiMfiMranan^TTTTTMTTTNGGTTGG^^ 


RCT-87 




TTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATT 
CGGCACGAGGCCAAACTGCACAGAGGGGAAATCAAGCTGAATTCTGAGCTGGACTTAGATGACTC
CATTTTAGAGAAGTTTGCCTTCTCCAACGCTC 

CAACACTGGACAAATTTATTGAGTCTATTCAGTCCATCCCAGAGGCTTTAAAAGCTGGGAAGAAA 

GTCAAACTGTCTCATAAAGAAGTTATGCAGAAAATGGGTGAGCTCTTTGCCCTCAGGCATCGAAT 

AAACCTGAGCTCAGACTTCTTGATCACCCCTGACTTCTACTGGGACAGAGCGAACCTGGAGGAGC 

TTTACGACAAGACCTGCCAGTTCCTCAGCATCACTCGAAGAGTTAAGGTCATGAATGAAAAACTG 

CAACACTGCATGGAACTAACAGACCTAATGCGCAATCACCTCAATGAGAAGCGAGCGCTCCGCCT 

GGAGTGGATGATCGTCATCCTCATCACCATTGAGGTAATG1TCGAGCTGGGGAGAGTCTTCTTC 

GATCTTGTGATCATGAAAGCACCACTGGAAGAGAGTAGTCAAGTTCTGCAACCAAAAAACCAGCG 


RCT-88




itatgacatgattacgaatttaatacgactcactatagggaatttggccctcgaggccaagaatt 
c:ggcacgaggctcctgggttcccagtctgagctttccaggtattgatgcttcacaggcactgcta 

CATACTGCTGAGGGAAACATGTCCTGCGTGACTTTACTGGGAAAATTCCTAGCAAAAATTTTGTG 
CCTGATTTCCTCTGGACGTGTGCTGAATAGACTTCCACAAGAGTTGTGCACACACATTCCTGAAA 
I^CTTCTCTGCTTCCTGCCCTTGGAATCTAACTGCCTCAGCCTAGAGGAATTCTAATCATTCCC^ 
GCTCACCCAAGGACCGCTGGCCCTGTTTATCTTGTCATTACCTAGAAATTCCTTCTTATCAGAAG 
GCGCATCTCTTTCATTGTCCCTTTTCCCCTCGTCCTTCACTACTTATA^ 

AACACAGAGGGGATGAATTCACTCGAAAATTGTTGTATGAAATTCTCAAGGAGTTAAAACCATG 
TTCAAAAGAACAAAATGTGGACAGCATGCTGCCATTTGCTTAAGGAAGTGAGATAGAGAAATGC 
CCGAAGGATTTCTGGAGGCGTCCAGGCTX5CTTCTTGCTACTTCA 


RCT-89 




TTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATT 

CGGCACGAGGTGTGTTCCTCTGAAATTCTGGACTGATTTGGACAACTGAATCTGGA 

ATACACAGGGGAGTTGATTTTTTATAAGCACATGAAACAGACAGTAGATTTCACACTTTTCAGGA 

CCATAGTGTTTGTCATGCTTTAGTTATATGTGAGTGAAGTGGCAGAGGAGG 

GATTCTGACTGTACTGGAACTGACCTTAAATGAGTAGAGAACATTCTTGGCCAGAACGTGGCACT

GTGTAGAGGCAGCCCCAACACTACCTCTTAATTGGTTTATTACATGTAACTTTCCATATTTTCCC 

TCCCTTTTAAAGTCAGCAGTATTTTCGTTTTAAATGTCATCTTAG 

CAACAGACACCACTAAAACTAGACCATTCCGAATTCATGTTCTTTCGAGTTTTATCCATTATCCT 

taaaagtaagtgctgttacacccacattcctgattc^ 

ggtttcctgtagtccagactggcctcatattcactgcatagctgaagatcaccttgaacttctga 


RCT-91 




TCTATGACATGATTACGAATTTAATACGAATCACTATAGGGAATTTGGCCCTCGAGGCCAAGAAT 
TCGGCACGAGGCTTTGCTC(^GCATGGCTGCCTTAGGGACCTGGCTATCCATAGGTGTCCGGA^ 
TTGCACAGTAGTGCAGTGGCGCGGGCCGGCAGCCAGTGGCGACTCCAGCAAGGGCTGGCTGCCAA 
TCCTTCCGGCTATGGGCCCCTCACGGAGCTTCCTGACTGGTCTTTCGCGGATGGCCGCCCTGCTC 
CCCCAATGAAAGGCCAACTTCGAAGAGAAGCTCAAAGGGAGAAGCTTGCAAGACGAGTTGTGCTG 
CTGACACAGGAAATGGATGCAGGATTACAGGCATGGAAGCTCAGGCAGCAGAAACTGCAGGAAGA 
AAGGAAGC^GAAACATGACCTTAAACCTAAAGGGACGTCACTGAGGAACCCACTTCCAAATCAAT 
AAAAAGCTCCTGTCCCACTTTCAAAAAAAAAAAAAAAAAAAAAAA7VAAGTGAGCGGCCGCAAGCT 
TATTCCCTTTAGTGAGGGTTAATTTTAGCTTGGCACTGGCCGTC 

GGAAAACCCTGGCGTTACCCAACTTAATCGCCTTGCAGCACATCCCCCTTTCCCAGCTGGCGTAA 
TAGCGAAAA 


RCT-92 




TATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC 

GGCACGAGGTGCCTACCCGGTCTCATTGTTCATGACTGCATATCATTAGCGCGCTTC 

TCATATTTTCGAAGATCTGGGGGGTTTTTTTCTATATCGCAGGATATTTTTGTACATC 

ACCTCATTCAGTTGATAATCCCAACATTGTTTGTCATCCTTAAATCATGAGAGTAAACCCAAGTA 

TGACAAATTAAAAGAAAACTCTAGTCTTTCTAAATTTGTCTTGTCTAAAAAATATGTTCCCA 

ACTGCCAGCATTGCTCTCACCTAAGGACGAACCACTCCTCCTCATTCCTTGTCTTCAACTCATGC 

ATTTGTAAATGATGCTGGCAACCTACATGAACAGACAACATTGTCTCCTTGCCTCTGGACAGCCT 

TACCAGCTGGTCTCATCTTCCTGCATGGCCACACCCCTAGTGATGGAACTCAGGTANCATAGCAC 

AACGTGAAGTTGTAGTCTGTTGAGCCTCCCATACCAATGAGAAAAGAAGCTTTGGAATTC 

TTCTGAGATTCTGGTAGTACCTTCATATTTCATGTTGTAGACATTTGAAACTG 


RCT-94 




TTCTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAA 
TTCGGCACGAGGAAACAGTTTATTTTAATATT^^ 

GTAGGAGTTCCGGGTCTCAATGTATATTCCGTGGTGTATATGATTCAGGAAAGAAAAATTAATCA 

AGAAACACATGGTTCTGTGACAGAGAGAGCTGTTTTCTGGGAGGTGTC^ 

TGGATCATGGTTCTTATTTATATTTCTGCATTCCATCTAGTTTCCTACATTC 

GTCCCCCAACTAATAAACTGACTGAAATCAAGAGAGCACCCCAAAGTTATTTGTAAATAGTTACG 

TGGGAAAAAAAAAGACATATTAAACTTGAAGATAAAATGTGNCAAACAANAAANAAAAAAAAAAA 

AACCCATGGGGCCGCAAGCTTATTCCCTTTAGTGAGGGTTAATTTTAGCTTGGCACTGGCCGTCG 

TTTTACAACGTCGTGACTGGGAAAACCCTGGCGTTACCCAACTTAATCGCCTTGCAGCACATCCC 

CCTTTCCCCAGCTGGCGTAATACCGAANAGGCCCGCACCGATCGCCCTTTCCCAACAGTTGCGCA 

RCT-99




TATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC 
GGCACGAGGGATCTCAGTATTTAAACTGTTCCTCAATTTTGTGAGGCTG 

CCTCTGATGCTGTTGGTATGCAAGGCAGCGGTGCTTACACAATATTTCCTGTGCTCTCCAGAGAC 

GATGGACTGATTTCCTGACACTACTCTCCC1TCACTTCCGTGGTTACCTTGAGTCTTGA 

AGTGCCCACGATGGGTGTAGCCTTTATTAAACAGATCGTGTATTCTGATCTCTCGCTGCAGCCAC 

AGTGCAGCTCCCTATAAACCTGCAGCCCAAACCATTTGTATCAGGCATCACCTACTAACACAGAC 

GTGCGCGGCTTTTCTGCATCAATTGCTGTGACGGTTCAGAATGTTGGTATACAAGAAGGAATAGA 

TGTTTGAAATAATTCTAGTACAAAGTATAATAAAACTAGATGTATAATAAACCCTTTAAATCATT 

GCTAAGTGTATAAGTGGGAACTGAAGCATTTATTGGACAAAGTAATGTO 

GCTCGCTCCTGCGTGGGCACACTGGTTATNATTT 


Phosphatidylethanolamine-binding protein




TATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC 
GGCACGAGGCTGAAGTGGGAGCGGCCCCAGCACGCCCTGAGGGTCGACTACGGCGGAGTAACGGT 
GGACGAGCTGGGCAAAGTGCTGACGCCCACCCAGGTCATGAATAGACCAAGCAGCATTTCATGGG 
ATGGCCTTGATCCTGGGAAGCTCTACACCCTGGTCCTCACAGACCCCGATGCTCCCAGCAGGAAG 
GACCCCAAATTCAGGGAGTGGCACCACTTCCTGGTGGTCAACATGAAGGGCAACGACATTAGCAG 
TGGCACTGTCCTCTCCGAATACGTGGGCTCCGGACCTCCCAAAGACAC^GGTCTGCACCGCTACG 
TCTGGCTGGTGTATGAGCAGGAGCAGCCTCTGAACTGTGACGAGCCCATCCTCAGCAACAAGTCT 
GGAGACAACCGCGGCAAGTTCAAGGTGGAGTCCTTCCGCAAGAAGTACCACCTGGGAGCCCCGGT 
GGCCGGCACGTGCTTCCAGGCAGAGTGGGATGACTCTGTGCCCAAGCTGCATGATCAGCTGGCTG 
GGAAGTGGGGGCGCTGCAGACCCGCAGCCCCGGGGACCCCACAGTACAGTCAAGTCGTATTAAAG 
CATGTGCTTGTGGGGTGTCCCCCCACGNCCCATCCT 


Phosphoglycerate 
kinase 


M31788


TTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCGCCCTT 

ATCGCGGGATCCTGTTGGGGTATTTGAATGGGAAGCCTTTGCCAGGGGAACCAAGTCCCTCATG^ 

ATGAGGTGGTGAAAGCCACGTCTAGGGGCTGCATCACTATCATAGGAGGCGGAGACACCGCCACT 

TGCTGTGCCAAATGGAACACAGAGGATAAAGTCAGCCATGTGAGCACTGGGGGCGGCGCCAGTCT 

AGAGCTCCTGGAAGGTAAAGTCCTTCCTGGGGTGGATGCTCTCAGCAATGTTTAGTATTO 

CCTTTGGTTCCTGTGCACAGCCCCTAAGTCGACCTAGTGTTTTCCGCATCTCCATTTGGT 

TGCAGCTAGTGGCCAAGACGCAGCACCAGGAACCCTAAGCAGCTGCACAGCATCTCAGCTCGTCT 

TTACTGCATCGGGATTCATCTACTACGTTCAAGATCCCATTTAAATTCCTTAGCGACTAAAACCA 

TTGTGCATTGTAGAGGGCATCTATTTATACTCTGCC 

AAAAGGGCGAATTCCAGCACACTGGCGGCCGTTACTAGTGGATCCGAGCTCGGTACCAAGCTTGA 


Poly(ADP-ribose) 
polymerase 


U94340


GTTATTTAGGTCNCACTATAGAATACTCAAGTTATGCATCAAGTTGGTACCGAGCTCGGATCCAC 

TAGTACCGGCCGCCAGTGTGCTGGAATTCGCCCTTCGCGGGATCCGCACAATGCCTATGACCTGG 

AAGTGATAGACATCTTTAAGATAGAGCGAGAGGGAGAGAGCCAACGCTACAAGCCCTTCAGGCAG 

CTTCACAACCGGAGACTGCTGTGGCACGGGTCCAGGACCACCAACTTCGCAGGCATCCTGTCACA 

GGGTCTGCGGATAGCCCCACCTGAAGCACCTGTGACAGGCTACATGTTTGGGAAAGGAATCTACT 

TTGCTGATATGGTGTCCAAAAGTGCGAACTACTGCCACACGTCTCAGGGAGACCCGATTGGCTTA 

ATACTGTTGCX5AGAAGTTGCCCTTGGAAACATGTACGAGCTCAAGCATGCTTCTCACATCAGCAA 

GTTACCCAAGGGCAAGCACAGTGTCAAAGGTTTGGGCAAAACCGCCCCTGACCCTTCGGCCAGCA 

TCACCCTGGATGGTGTAGAGGTTCCGCTGGGAACAGGGATTCCGTCTGGTGTTAATGACACCTC 

CTGCTGTATAACGAGAAGCTTGGCCAAGGGCGAATTCTGCAGATATCCATCACACTGGCGGCCGC 

TCGAGCATGCATCTAGAGGGCCCAATT 


Preproalbumin


V01222


TATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC
GGCACGAGGCCCCACTAGCCTCTGGCACAATGAAGTGGGTAACCTTTCTCCTCCTCCTCTTCATC 
TCCGGTTCTGCCTTTTCCAGGGGTGTGTTTCGCCGAGAAGCACACAAGAGTGAGATCGCCCATCG 
GTTTAAGGACTTAGGAGAACAGCATTTCAAAGGCCTAGTCCTGATTGCCTTTTCCCAGTATCTCC 
AGAAATGCCCATATGAAGAGCATATCAAATTGGTGCAGGAAGTAACAGACTTTGCAAAAACATGT 
GTCGCTGATGAGAATGCCGAAAACTGTGACAAGTCCATTCACACTCTCTTCGGAGACAAGTTATG 
CGCCATTCCAAAGCTTCGTGACAACTACGGTGAACTGGCTGACTGCTGTGCAAAACAAGAGCCCG 
AAAGAAACGAGTGTTTCCTGCAGCACAAGGATGACAACCCCAACCTGCCACCCTTCCAGAGGCCG 
GAGGCTGAGGCCATGTGCACCTCCTTCCAGGAGAACCCTACCAGCTTTCTGGGACACTATTTGCA 
TGAAGTTGCCAGGAGACATCCTTATTTCTATGCCCCAGAACTCCTTTACTATGCTGAGAAATACA 
ATGAGGTTCTGACCCAGTGCTGCCAGAGTCTGACAA 



TTCTGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATT 

CGGCACGAGGTCTGTCTGCCATCCTGAACCGTCTGTGTGTGCTGCATGAGAAGACCCCAGTGAGC 

GAGAAGGTCACCAAGTGCTGTAGTGGGTCCCTGGTGGAAAGACGGCCATGTTTCTCTGCTCTGAC 

AGTTGACGAGACATATGTCCCCAAAGAGTTTAAAGCTGAGACCTTCACCTTCCACTCTGATATCT 

GCACACTCCCAGACAAGGAGAAGCAGATAAAGAAGCAAACGGCTCTCGCTGAGCTGGTGAAACAC 

AAGCCCAAGGCCACAGAAGATCAGCTGAAGACGGTGATGGGTGACTTCGCACAATTCGTGGACAA 

GTGTTGCAAGGCTGCCGACAAGGATAACTGCTTCGCCACTGAGGGGCCAAACCTTGTTGCTAGAA 

GCAAAGAAGCCTTAGCCTAAACACATCACAACCATCTCAGGCTACCCTGAGAAAAAAAGACATGA 

AGACTCAGGACTCATCTCTTCTGTTGGTGTAAAACCAACACCCTAAGGAACACAAATTTCTTTO 

ACATTTGACTTCTTTTCTCTGTGCCGCAATTAATTAAAAATAGAAAG^ 

NNNNNNNNNNNNNNAAAGTGTGGCGGNCGCNAGCTTATTC 

GGCACT 






OTCAAGCTATGCATCAAGCTTGGTACCGAGCTCGGATCCACTTAGAAACGGCCCGCCAGTGTGCT 
GGAATTCGCCCTTCGCGGGATCCGGGGCTGAAGATAATGCGTGATACCTT 

AGCACCAAATCAAGAGAAAGTTTCAGACTATGAGATGAAGTTAATGGACTTAGACGTTGAGCAAC 
TTGGAATCCCAGAACAGGAGTACAGCTGCGTAGTAAAGATGCCATCTGGTGAATTTGCACGTATA 
TGCCGGGACCTTAGCCATATTGGAGATGCTGTGGTGACCTCCTGTGCAAAGGACGGGGTGAAGTT 
TTCTGCGAGTGGGGAGCTTGGCAATGGGAACATTAAGTTGTCCCAGACAAGCAATGTTC 
AAGAGGAAGCTGTGTCCATAGAGATGAATGAGCCAGTTCAGCTAACTTTTGCTCTGAGGTACCTG 
AACTTTTTCACAAAAGCCACTCCACTGTCTCCTGCAGTAACACTCAGTATGTCTGCAGA 
CCTTGTTGTAGAGTATAAAATTGCTGACATGGGACACTTAAAGTATTATTTGGCTCCC 
AAGATAAGCTTGGCCAAGGGCGAATTCTGCAGATATCCATCACACTGGCGGCCGCTCGAGCATGC 
ATCTAGAGGGCCGATTC 






TGCGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTC 
GCCCTTCGCGGGATCCGTGGACTACGGTGTCGAGGCACTGGTGGATGCCTTCTCACGCCAGAGGG 
CTGGCCGGATTGGTGGGGGTAGGAACTTTGACTACCATGTTCTGCATGTGGCCGAGGA 
AAGGAGTCCCGAGAAATGCGCCTGCAGTCCTTCAATGAATACCGAAAGAGGTTTGGCCTGAAGCC 
TTACACTTCTTTCCAGGAGTTCACAGGAGAGAAGGAGATGGCCGCTGAGTTGGAGGAGCTATATC 
GTGACATCGATGCTTTAGAGTTCTACCCGGGGCTGATGCTGGAGAAGTGCCAGCCCAACTCCCTC 
TTTGGGGAGAGCATGATAGAGATGGGGGCTCCTITCTCCCTCAAGGGCCTCCTAGGGAATCCCAT 
CTG1TCCCCAGAGTACTGGAAACCCAGCACATTCGGTGGTGATGTGGGTTTCAACATC 
CAGCCTCACTGAAGAAACTGGTCTGCCTCAACACCAAGACCTGCCCCTATGTCTNCTTCCGTGTG 
CCAGATAAGCTTGGCCAAGGGCGAATTCCAGCACACTGGCGGCCGNTACTAGTGGATCCGAGCTC 
GGTACCAACTTGATGCATAGCTTGA 






GAATCGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCGCCC 

TTGGGACGAAGACGACAAAGGTCCTCCCTGTGGTCCAGTGAACTGCAATGAGAAGATTGTGGTCC 

TCCTGCAACGCCTAAAGCCCGAGATCAAGGATGTCATTGAGCAACTCAACCTGGTTACTACCTGG

TTGCAGCTACAGATACCTCGGATAGAGGATGGGAATAATTTTGGCGTGGCTGTC 

GTTTGAGCTGATGACCAGCCTTCATACCAAGCTGGAAGGCTTCCAAACGCAGATCTCTAAGTACT 

TCTCCGAGAGGGGTGATGCCGTGGCCAAAGCAGCCAAGCAGCCTCATGTGGGTGATTATCGGCAG 

CTGGTGCATGAGCTGGACGAGGCGGAATACCAGGAGATCCGGCTGATGGTCATGGAGATCCGTAA 

CGCTTATGCTGTGTTATATGACATCATCCTGAAGAACTTTGAGAAGCT 

AGACAAAGGGGATGATCAAGGGCGAATTCCAGCACACTGGCGGCCGGTACTAGTGGATCCCNAGC 
TCGGTACCAAGCCTTGATGNATAGCTTNGAGTATTCTATTAGTGTCACCCTAAATAGNTTTGGCN






TATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC
GGCACGAGGGTTTGCCGTCAGAACGCAGGTGTTGTGAAAGCCACCGCTAATTCAAAGCAAAAATG 
GGAAAGGAAAAGACTCACATCAACATTGCCTATTGGCTGCACCCCAGGACCAGTGCCCAGATCCA 
CTTGCTTGGAAACATCGTGATCTGGACTTCAGCCAGCCTCGCCACAGTGGCATACACCCTACTCT 
TCTTCTGGTACCTGCTCCGCCGTCGAAGGAACATCTGTGACCTCCCTGAGGATGCCTGGTCCCAC 
TGGGTGCTGGCTGGAGCTCCAGGAATGAATTCCAATTTCATCTCAAGA 

TTCTCCTCACACAGTGAAGAATGTGCCCAGCCACAGCATCACCCATGAGGCCCAACTCTGACCAG 
TGTTTGAGCTGCCAGTGTAGGACTCACCTACACTACACCTAAGGCAGGAGGAGCAGCCAGTGAAG 
GAGTGAAGTCCAGGCCCCGCCCAGCTGTGCGCCCACCAATGGGGTCTTAGCTCTCTCCCGANGCC 
CACAGTACTGCCACTCATTTGTGTGCAGGTACAGTGGCCCTCTGTAAAGCCTGCTTTGAAANCTG 
CCTTCACTCACACTGACTCCTCACCAATGCGACTCTANAAATC 

Protein tyrosine
phosphatase alpha 


L01702


^TATGCATCAAGTTGGTCCGAGCTCGGATCCACTAGTAACGGCCGCCAGTGTGCTGGAATTCGC 
^ ^ATCGTCGG ATCCTGCAG AGGCC ACAC ATGGTC C AG AC ACTGG AACAGT ATG AATTC TGCTA 
^AAGGTGGTACAGGAGTACATCGATGCCTTCTCAGATTATGCCAACTTCAAGTGACAGGGGACAA 
^CCCACAGACAGGAGAATTGCCTTTAATATTTTGTAATATTCTGTTTTC 

ttctatatatctcataactgt^ 

TTTGTATGTAAATGTGTTAGCACTATAGTCCTTTTCCAGTGTTTTATTGGGG 
GATATTTGAGTTGATTTAATGAAGTCCTTAGCCTGGAAATTG^ 

ATGTCTTTTTCCAAAGAAGACAAACATAAGAGTCATTCCAGGTAACTCGGTGCCAACTAAAACAA 
AGCACAAAGTTCTCGGAGCTCTTGAGGAAATGGTTGTCTCACCGTCCCCAGGCCGGCCTCTTCCC 
TTCCCTGAAGCTTGGCCAAAAGGGCGAATTCTGCAGATATCCATCACACTGGCGGCCGCTCGAGC 


PTEN/MMAC1


AF017185


GCGAATTGGGCCCTCTAGATCCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCG 

CCCTTATCGCGGGATCCCTATTCCAATGTTCAGTGGCGGAACTTGCAATCCCCAGTTTGTGG 

GCCAGCTAAAGGTGAGGATCTACTCCTCCAACTCAGGACCCACGCGGCGGGAGGACAAGCTCATG 

TACTTTGAGTTCCCTCAGCCATTGCCTGTGTGTGGTGACATCAAAGTAGAGTTC 

GAACAAGATGCTCAAAAAGGACAAAATGTTTCACTTTTGGGTAAATACGTTCTTCATACCAGGAC 

CAGAGGAAACCTCAGAAAAAGTGGAAAATGGAAGTCTTTGTGATCAGGAAATCGATAGCATTTGT 

AGTATAGAGCGTGCGGATAATGACAAGGAGTATCTCGTGCTCACCCTGACAAAAAATGATCTTGA 

CAAAGCAAACAAAGACAAGGCCAACCGATACTTCTCTCCAAATTTTAAGGTGAAATTATACTTTA

CAAAAACAGTAGAGGAACCATCAAATCCAGAGGCTACAAAGCTTGGCAAGGGCGAATTCCAGCAC 

ACTGGCGGCCGTTACTAGTGGATCCGAGCTCGGTACCAAGCTTGATGCATAGCTTGAGTATTCTA 

m *Mm^m^»n^m^ 7k iv m tv nnrrj-mn ntiT a a TT* A r rnci r VC AT ACJCTGGTTC CTG 


Pyruvate kinase, 
muscle 


M24361


TATCACATGATTACGAATTTAATACGACTC^ 

GGCACGAGGCGCATGCAGCACCTGATTGCCCGAGAGGCAGAGGCTGCCATCTACCACTTGCAGTT 
ATTCGAGGAACTCCGCCGCCTGGCGCCCATTACCAGCGACCCCACAGAAGCTGCCGCCGTGGGTG 
CCGTGGAGGCCTCCTTCAAGTGCTGCAGTGGGGCCATTATCGTGCTCACCAAGTCTGGCAGGAGT 
GCTCACCAAGTGGCCCGGTACCGCCCAAGGGCTCCTATCATTGCTGTGACACGCAATCCCCAGAC 

AGCCCGCCAGGCCCATCTGTACCGTGGCATCTTCCC 
CCTGGGCTGAGGACGTTGATCTTCGTGTGAACTTGGCCATC 

TTCAAGAAAGGAGATGTGGTCATTGTGCTGACTGGATGGCGCCCTGGCTCTGGCTTC 

CATGCGTGTANTGCCTGTACCATGATGATCCTCTGGAGCTTCTCTTCTAACC 

CTCCCCTATNCTATTCATTAAGGCACAACGCTTGTAGTGCTCACTCTGGGNCATAATGTGGCGCT 

~,n\/-*/-^r^/->rr^r*r* * /"m /"«/"^ A rf^/" 1 7v TV 7V JV A TAT* ^ 7A r TYTMf ,r Pf TY^ A A 




U12187


GNGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATC 
CCCTTCGCGGGATCCCAGGGCACACATATGACCGTTCTATCAC 
CTAATGGTCTATGACATTTGGGAACAGGATGCAGGCTC^ 
GGGGGATGCATATGTCATTGTGTACTCAATAACGGACAAGGGCAGCTTTC 

TCCGGGTCCAGCTGCGGCGGGCACGGCAGACAGACAATGTGCCCATCATCCTAGTGGGCAACAAG 
AGCGATCTGGTGCGCTCTCGTGAGGTCTCTGTGGCTGAGGGCCGGGCCTGCGCAGTGGTCTTCGA 
CTGCAAGTTCATCGAGACCTCCGCAGCACTGCATC^CAACGTCCAGGCACTGTTCGAGGGTGTCG 
TGCGTCAGATACGCCTGCGCAGGGACAGCAAAGAGGATAATGCTCGTCGACAAGCTGGCACTCGA 
CGACGGGAGAGCCTTGGCAAGAAGGCCAAACGCTTTCTGGGCCGCATAGTAGCTCGCAACAAGCT 
TGGCCAAGGGCGAATTCCAACACACTGCGGCCGTTACTAGTGGATCCGAGCTCGGTACCAAGCTT 

GATGCATAGCTTGAGTATTCTATAG 


Ref-1 


D44495


GNATNTCCTCCACNANAATANAANGATGNANCOTAANNANAATTTAGCAA^ 
NTNGANNNNNNANANNNNNTTCCAGTGNNGAT^ 

CNGGGATCCCGAAANAACCCAAGTCCGAANCCAGAAGACCAAAANNANTAAGGGGNCANCAAAAN 

AAAATTTAGAAGGAGNCCNCAAGGAAAAGGGCCCTTGTCNTGTA 

GAAAACGTCAGCCAGTGGCAAAATCTTGCCCACACTCAAAAATATGCTCC 

CTTCGAGCCTGGATTAAAAAGAAAGGCTTGGATTGGGTAAAGGAAGAAGCACCA 

CCTCCAAGAGACCAAATGCTCAGAGAACAGACTTCCCGGCTGAACTGCAAGAGCGTGCCTOTACT 

CACCCATCAGTACTGGTCAGCCCCATCAGACAAAGAAGGATATAGTGGTGTGGGCCTACTTTCCC 

GCCAATGCCGCTCAAAGTCTCTTATGGCATTGGTGAGGAAGAACATGATCAAGAAGGCC 

TTCTGGCTGAATTTGAGTCCTTTATCTTGGTAACA^ 

GTAAGACTGGAGTACCGACAGCGATGGGATGAAGCCTTCAGAAAGTTTCTAAAGGACTTG 

TGCGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTC
GCCCTTCGCGGATCCATCCAGGTGATTTTCGGTGCCGTGGACCTGCCTGCCAAGTTTGTATC 
CCTAGTCATCAACTCCATGGGGCGCCGGCCTGCACAGATGGCCTCCCTGTTGCTGGCAGGCATCT 
GCATCCTGGTGAATGGCATAATACCGAAGAGCCATACGATCATTCGCACCTCCCTAGCTGTGCTA 
GGGAAGGGCTGCCTGGCTTCCTCTTTCAACTGCATCTTCCTGTACACCGGAGAGCTGTACCCCAC 
AGTGATTCGGCAGACAGGCCTGGGCATGGGCAGCACCATGGCCCGGGTGGGCAGCATTGTGAGCC 
CGCTGGTGAGTATGACTGCAGAGTTCTACCCCTCCATGCCTCTCTTCATCTTCGGCGCTGTCCCT 
GTGGTCGCCAGTGCTGTCACTGCCCTGCTGCCAGAGACCTTGGGCCAGCCGCTGCCAGATACAGT 
ICAGGACCTGAAGAGCAGGAGCAGAGGAAAGCAGAATCAACAGCAGCAGGAACAGCAGAAGCAAG 
ITTGGCCAAGGGCGAATTCCAGCACACTGGCGGGCCGTTACTAGTGGATCCGAGCTCGGTACCAA 

GCTTGATGCATAGCTTG 



GCGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCG 
CCCTTCGCGGAATTCCTTGTGGAGTGGGCCAAGAGGATCCCACACTTTTCTC^ 
CGACCAGGTCATCCTCCTCCGGGCAGGCTGGAACGAGCTGCTGATTGCCTCCTTCTCCCACCGCT 
CCATAGCTGTGAAAGACGGCATCCTCCTGGCCACCGGCCTGCACGTACACAGGAACAGCGCTCAC 
AGTGCTGGGGTGGGCGCCATCTTTGACAGGGTGCTAACGGAGCTGGTGTCGAAGATGCGTGACAT 
GCAGATGGACAAGACGGAGCTGGGCTGCTTGCGCGCCATTGTCCTCTTCAACCCTGACTCTAAGG 
GGCTCTCCAACCCTGCTGAGGTGGAGGCGCTGAGGGAGAAGGTGTATGCATCACTAGAAGCGTAC 
TGCAAACACAAGTACCCTGAGCAGCCGGGCAGGTTTGCCAAGCTGCTGCTCCGGJCTGCCTGCACT 
GCGATCCATTGGGCTCAAGTGCCTGGACACCTGTTCTTC 

C^GGGCGAATTCCAGCACACTGGCGGGCGTTACTAGTGGATCCGAGCTCGGTACCAAACTTGAT 
GCATACTTGAGTA 






GNNNNfcTOCTATGACATGATTACGAATTTAATACGAC 

AAGAATTCGGCACGAGGGCCTTCTCAGACTCCCTCAGGAGGGAGCTCACCTACTTTGGGGTGAAG 
GTGGCTATTATAGAGCCTGGTGGGTTCAAGACCAATGTCACTAATATGGAGAGGCTATCAGACAA 
CCTGAAGAAGCTGTGGGACCAGGCCACTGAGGAGGTCAAGGAGATCTACGGCGAGAAGTTTCGGG 
ACTCCTATATGAAAGCAATGGAGTCACTGGTGAACATGTGCTCAGGGGACCTGTCTCTGGTAACC 
GACTGCATGGAGCACGCCCTGACTTCCTGTCACCCTCGCACCCGGTACTCAGCTGGTTGGGATTC 
CAAGTTCTTCTACCTCCCCATGAGCTACCTTCCCACCTTTCTTTCGGATGCCGTAATC 
GCTCTGTAAAGCCTGCCCGAGCCCTGTGAATCTGCACATGTGTGCAGACTTGGGGAAGTAAGGCG
GGTGGAGGGAGATAACAATGTGGGGTCCATTGTTCACCATACTCATTAAAATAATTCTGCTTC 
TACTAAAAAAAAAAAAAAAAAAAAAAAAAAAAGTGTGGCGGCCGCAAGCTTATT 






TATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC 
GGCACGAGGCGTTTCTCTGGGCTCTGGTATGCCATCGCCAAAAAGGATCC^ 
GCAAGACAACATCATCGCTGAGTTTTCTGTGGACGAGAAGGGTCATATGAGCGCTACAGCCAAGG 
GACGAGTCCGTCTTCTGAGCAACTGGGAAGTGTGTGCAGACATGGTGGGCACTTTCACAGACACA 
GAAGATCCTGCCAAGTTCAAGATGAAGTACTGGGGTGTAGCCTCCTTTCTCCAGCGAGGAAACGA 
TGACCACTGGATCATCGATACGGACTACGACACCTTCGCTCTGCAGTACTCCTGCCGCCTGCAGA 
ATCTGGATGGCACCTGTGCAGACAGCTAGTCCTTTGTGTTTTCTCGTGACCCC 
CCGGAGACACGGAGGCTGGTGAGGCAGCGACAGGAGGAGCTGTGCCTAGAGAGGCAGTACAGATG 
GATCGAGCACAATGGTTACTGTCAAAGCAGACCCTCAAGAAACAGTTTGTAGCAATGTCAAGGAT 
GTATAAAGTTGGAAAACTTCTGATTAGCTCTCATCCAGTCTTCA 






GGGAATTGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCGC 
CCTTCGCGGAATTCCATTGTGGCCAAGCAGGTACTGCTGGGCCGAAAGGTGGTGGTTC 
TGAGGGCATCAACATTTCTGGAAATTTCTACAGAAACAAGTTAAAGTATCTGGCCTTTCTC 
AGCGGATGAACACCAACCCGTCTCGAGGCCCCTACCACTTCCGAGCCCCAAGCCGCATTTTTTGG 
CGCACTGTGCGAGGCATGCTGCCGCACAAGACCAAAAGAGGCCAGGCTGCCCTGGAACGCCTCAA 
GGTGTTGGATGGGATCCCTCCACCCTATGACAAGAAAAAGCGGATGGTGGTCCCTGTTGCCCTCA 
AGGTTGTGCGGCTGAAGCCTACCAGAAAGTTTGCTTACCTGGGGC 

TGGAAGTACCAGGCAGTGACAGCTACTCTGGAGGAGAAACGGAAGGAAAAGGCAAAGATCCATTA 
CCGGAAGAAGAAGCAGCTCTTGAGGCTAAGGAAACAGGCAGAAAAGAATGTGGAGAAAAGCTGGC 
CAAGGGCGAATTCCAGCACACTGGCGGCCGNTACTAGTGGATCCGAGCTCGGTACCAACTTGATG 

CATAGCTTGAGTATT 






amOTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATT^ 
ATTCGGCACGAGGGCCTGCTGCTCGCTGTCGAAATGGGCAAGTTTATGAAACCCGGGAAAGTGGT 
GCTGGTCCTGGCTGGACGCTACTCCGGACGCAAAGCCGTCATCGTAAAGAACATTGATGATGGCA 
CCTCCGACCGCCCTTACAGCCATGCCCTCGTGGCTGGAATTGACCGCTATCCCAGAAAAGTGACA 
GCTGCCATGGGCAAGAAGAAGATCGCCAAGCGATCCAAGATCAAGTCCTTTGTGAAAGTTTATAA 
CTACAACCACCTCATGCCCACAAGGTACTCTGTGGATATCCCCTTGGACAAAACTGTTGTCAACA 
AGGATGTGTTCAGAGACCCAGCACTGAAACGCAAGGCCAGGCGGGAGGCCAAGGTCAAGTTTGAG 
GAGCGATACAAGACAGGGAAGAACAAATGGTTTTTCCAGAAGCTTCGCTTTTAGATGTAW 
TTTTCGTCATTACAAAAATAAAAAATANTAAAAAACAAAAAAAAAAAAAAATCTCG 
GCTTATTCCCTTTAGTGAGGGTTAATTTTAGCTTGGCACTGGCCGTC 



TCTATGACATGATTACGAATTTAATACGACTCACTATAGGGAAT1TGGCCCTCGAGGCCAAGAAT 
TCGGCACGAGGGAAGCTTTGCGCTTCCTCTTTCCAGCCAGCGCCGAGCGATGGGCATCTCTCGGG 
ACAACTGGCACAAGCGCCGCAAGACCGGGGGTAAGAGAAAACCCTACCACAAGAAGCGGAAGTAT 
GAGCTGGGACGGCCGGCCGCCAACACTAAGATTGGCCCTCGCCGCATACATACAGTCCGAGTTCG 
AGGAGGCAATAAGAAGTATCGTGCTCTGAGATTGGATGTGGGGAACTTTTCCTGGGGCTCAGAGT 
GTTGTACTCGCAAAACAAGGATCATTGATGTTGTCTACAATGCATCCAATAACGAGCTTGTCCGC 
ACCAAGACCCTGGTGAAGAACTGCATTGTGCTTATTGACAGCACACCGTACCGACAGTGGTACGA 
GTCCCACTATGCACTGCCCCTGGGCCGCAAGAAGGGGGCCAAGCTGACTCCTGAGGAGGAAGAGA 
TTTTAAACAAAAAACGATCAAAGAAAATTCAGAAGAAATATGATGAAAGGAAAAAGAATGCCAAA
ATCAGCAGTCTTCTGGAGGAGCAGTTCCAGCANGGCAAGCTTTCTCGCCTGTATTGCCTCAAGAA 
CAGGCCAGTGTGGCAGANCAGATGGCTATGTGCTCNAANGCAANGAGCTGGAGT 






AGCTATTTAGGTGNCACTATAGAATACTCAAGCTATGCATCAAGTTTGGTACCGAGCTCGGATCC 
ACTAGTAACGGCCGCCAGTGTGTTGGAATTCGCCCTTCGCGGAATTCTATGTGACCCCACGGAGA 
CCCCTCGAGAAATCGCGTCTCGACCAGGAGCTAAAGTTGATTGGAGAGTATGGGCTCCGGAACAA 
ACGTGAGGTGTGGAGGGTCAAATTTACCCTGGCGAAGATTCGTAAGGCTGCCCGGGAGCTGTTGA 
CGCTGGACGAGAAGGATCCTCGGCGTCTGTTTGAAGGCAACGCTCTGCTGAGACGACTTGTTCGA 
ATTGGGGTGCTGGATGAGGGCAAGATGGAGCTGGATTACATTCTGGGCCTGAAGATTGAGGATTT
CTTGGAGAGAAGGCTGCAGACCCAGGTCTTTAAGCTGGGCCTGGCCAAATCTATTCACCATGCCC 
GTGTGCTCATCCGCCAACGTCACATCAGGGTCCGCAAGCAGGTGGTGAACATTCCATCTTTCATT 
GTTCGCCTGGACTCTCAGAAGCACATTGACTTCTC^ 

AGGACGAGTGAAGAGGAAAGCTTGGCCAAGGGCGAATTCTGCAGATATCCATCACACTGGCGGCC 
GCTCGAGCATGCATCTAGAGGGCCCAATTCACAA 






GGAATAGGCCCNCNNGATGATGCTCGAGCGGCCNCCAGTGTGATGGATATCTGCAGAATTCGCCC 
TTTGCAGAAATGTAAGGGTGTTCGGGTGCGTGCATGTGCGTTC 

TCTGCATGACTGAGCTTGGGGAAAGAGAAATAGAACAGCCCCCAGCTCACTGTGTGATGTGGAGG 
AAATGTGTATTACAAGTGGGGTTTTAGCTGTTGAGTCAAAATAATAACAAGTGTACAAT^ 
TAAGGAATCGGAGAGCCTCTCCAGAGAAGTCGGTTTCTTTGCTGCAAGAAGAATGAGGTTCTGAA 
CCCTTATCCAAGAACAGAAGCCATCAGCCAAGTCTCCACATTTCTCTGCAAAATGTTGTAGCCTC 
TATAACTGNATGATAGTGNAATGCATGCCTTCAGTTGTAAGTGGCCCAGATCGCGCTTCAAN






TTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATT 
CGGCACGAGGGGCAAGTGAAGGCTTGCAACTTTCACTTGCCCAGAGGAAGCTCTGACGAAGGGGA 
TGCATAAACCAGCTCCTGTGTAAGTTATCTGAGGAGTCTGGGGCAGCTACCAGTAGCTGCTGCTG 
CCACTGCCGACACCTCATATTTGAGAAGTCAGGATCTGCAATCACTTGACAGTGTGCCGAAAACC 
TCCCATCCTTGTGTAGCTGACAGGGGCTTTTCGCGGAGGAGAAAGTCATTGAATCCTC 
AGATCACCTCCAGCTGCCTGACACAGTCAGCATGTAAGCCCCACAGAAGCCAGCCCCAACTGAAG 
CTGAAATAATAAGACCAAGAAGTGAAAATGAAATTTGAACTAAATATTTAAAATAAAGCGTACTC 
TCCCCAACTCCATCTAAAGACACAATTTCATTTCTAGAATGTTTCCAATCCATTTAATTAATT
TGAAGTAAAAGTAGTTGAAATTGGATTTGTGCAAACATGGAGAAATCTA 
AATTTAAAATTTTTATGCCACAAACCATTTCATCCAAATCAGATTTGTA 
AAGTGATTGCGGNCATTGGGTAATATGGCTTTCTTTTCTTTCC 






GAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATAT 

CTTATCGCGGGATCCCCAAGGGTCCTTGTACTCCCTTCTTCCTGATCACAGTGTGAAGAAATACT 

TTGACCAAGTGGATATCTCCAATGGTTTGGATTGGTCCCTGGACCATAAAATCTTCT^ 

GACAGCCTGTCCTACACTGTGGATGCCTTTGACTATGACCTGCCAACAGGACAGATTTCCAACCG 

CAGAACTGTTTACAAGATGGAAAAAGATGAACAAATCCCAGATAGAATGTGCATTGATGTTGAGG 

GGAAGCTTGGGGTGGCCTGTTACAATGGAGGAAGAGTAATTCGCCTAGATCCTGAGACAGGGAAA 

AGACTGCAAACTGTGAAGTTGCCTGTTGATAAAACAACTTCATGCTGCTTTGGAGGGAAGGATTA 

CTCTGAAATGTACGTGACATGTGCCAGGGATGGGATGAGCGCCGAAGGTCTTTTGAGGCAGCCTG 

ATGCTGGTAACATTTTCAAGATAACAGGTCTTGGGGTCAAAGGAATTG 

GGGCGAATTCCAGCACACTGGCGGCCGTTACTAGTGGATCCGAGCTCGGACCAAACTTGATGCAT 
ACTTGAGTATTCTATAGTGNCACCTAAATAGCTTGGCGTAATCN 






GNCACTATAGAATACTCAAGCTATGCATCAAGCTTGGTACCGAGCTCGGATCCACTAGTAACGGC 
CGCCAGTGTGCTGGAATTCGCCCTTGGACATCCGCATGAATGCTGTGTAACACACCCTGGGAGAG 
GACACCTCTTCCCAGCCACCTCTCTCAGCTCTGAAAAGCCCCACTGGACTCCTCCCCTCTAAGCC 
AAGCCTGATGAAGACACGGTCCTAACCACTATGGTGCCCAGACTCTTGTGGATTCCGACCACTTC 
TTTCCGTGGACTCTCAGACATGCTACCACATTCGATGGTGACACCACTGAGCTGGCCTCTTGGAC 
ACGTCAGGGAGTGGAAGGAGGGATGAACGCCACCCAGTCATCAGCTAGCTTCAGGTTTAGAATTA 
GGTCTGTGAGAGTCTGTATCATGTTTTTGGTAAGATCATACTACC^ 

AGCCTTCAATGTTCATGAATACATAAACCACCTAAGAGAAAACAGAGATGTCTTGCTAGCCATAT 
ATATTTTCTCGGTAGCATAGAATTCTATAGCTGGAATCTCCTAGAACCCTGTAACCCACGTGCTG 
CTGTGAGGTTAAGGAGGGAAGGTAAGGGCGAATTCTGCAGATATCCATCACACTGGCGGCCGCTC 
GAGCATGCATCTAGAGGGCCCAATTCGC 

AATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCGCCC
TTGAAGGGGTGCTTCAGGAAGGCATATGATGATCCGGGGAGAGAACATGTCCAAAATCCTGAAAG 
CACGGTCCATGATCACCAGGTGCTTTAGGGACCACTTCTTCGACAGAGGCTACTGTGAAGTAACC 
ACTCCAACACTGGTGCAGACACAGGTGGAAGGTGGGGCCACACTCTTCAAGCTTGACTATTTTGG 
GGAAGAAGCATTTTTGACCCAGTCCn^ACAGCTGTACCTGGAGACCTGCCTTCCAGCCCTGGGAG 
ATGTTTTTTGCATAGCCCAGTCTTACAGGGCTGAACAGTCCAGGACACGAAGGCATCTGGCTGAG 
TTCACTCACGTGGAAGCCGAGTGTCCTTTCCTGACCTTCGAGGACCTCCTGAGCCGTCTAGAGGC 
CCTGTGTTGTCTGTCTCAGGTAAGGGCGAATTCCAGCACACTGGCGGCCGTTACTAGTGGATCCG 
AGCTCGGTACCAAGCTTGATGCATAGCTTGAGTATTCTATAGTGTCACCTAAATAGCTTGGCGTA 
ATCATGGTCATAGCTGTTTCCTGTGTCAAATTGTTATCCGCTCAC 
CCGGAACATAAAGTGTAAAAGCTGG 



TCTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAAT 
TCGGCACGAGGCAGGCGGCTCGGACTGAGCAGGGCTTTCCTTGCCAGTGGATTGTGTAGAGTGTA 
CAGCCAGTCTCTTGTCTTCTGTCCAACATGGCATCTTCTGATATTCAGGTGAAAGAGCTGGAGAA 
GCGTGCTTCCGGCCAGGCTTTTGAGCTGATTCTCAGCCCTCGATCAAAAGAATCTGTCCCCGAGT 
TCCCCCTTTCCCCCCCCAAAGAAGAAGGATCTTTCCCTGGAGGAAATTCAGAAGAAATTAGAAGC 
TGCAGAAGAAAGACGCAAGTCTCATGAAGCAGAAGTCTTGAAGCAGCTCGCTGAGAAGCGGGAGC 
ATGAAAAAGAAGTGCTCCAGAAAGCCATTGAGGAGAACAACAACTTCAGCAAAATGGNAGAGGAG 
AAACTGACCCACAAAATGGAGGCTAACAAAGAGAACCGGGAGGCGCAAATGGCTGCCAAGCTGGA 
GCGTTTGCGAGAGAAGGACAAGCACGTTGAAGAGGTGCGGAAGAACAAAGAATCCAAAGACCCCG 
CGGACGAGACCGAGGCTGACTAAGTTGTTCCGAGAACTGACTTTCTCCCGACCCCTTCCTAAATA 
TTCANAGACTGTACTGGNGCAG 






TGCGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTC 
GCCCTTATCGCGGGATCCTCTGCAGCACAATTTAGGCCTTGGAGGAGCTGCTGTTGTCACCCTCT 
ACAGAATGGGTTTTCCCGAAGCTGCCAGCTCCTTCAGAACGCACCAGATTTCAGCTGCTCCCACC 
AGCTCTGCAGGH3GATGGATTCAAGGCAAATCTCATTTTTAAGGAAAT 
GGAAGGGGAAGAGTTCGTGAAGAAAATCGGTGGCATTTTTGCCTTC 

GGGGCAAAGAAGCTACGTGGGTGGTGGACGTGAAGAACGGCAAAGGATCGGTGCTTCCGGATTCA 
GATAAGAAGGCTGACTGCACAATCACCATGGCTGACTCAGACTTGCTGGCTTTCATGACTGGTAA 
AATGAACCCTCAGTCGGCCTTCTTTCAAGGTAAACTGAAAATTGCCGGTAACATGGG 
TGAAACTGCAAAGCCTGCAGCTTCAGCCGGACAAAGCTAAGCTGTGAAAGAGTCCCTTTGGCTC 
AGGGCCAAAGGGCGAATTCCAGCACACTGGCGGCCGTTACTAGTGGATCCGAGCTCGGTACCAAA
CTTTGATGCATAGCTTGAGTATTCTATAGTGTCACCTAAATAGCTTGGCGTAATC 



TNTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAAT 
TCGGCACGAGGCTTCTGCCCTTGAGGTATCCAGGTACCATTGTGCACTGAGCAGGACTGAAAAGA 
ACAAAGTTTATTCTGGAACTAAACCTTCTTCCCTGAGACATCATGGCCCTGGCCCCAGAATTGAG 
CAGACAGACAAAACTGAAAGAGGTCGCAGGGATCCCACTGCGGGATTCAACTGTCGACAACTGGA 
GTCAGATTCAGACCTTCAAGGCGAAGCCAGATGACCTCCTCATCTGTACTTACCCTAAATCAGGG 
ACAACATGGATTCAAGAAATTGTCAACATGATTGAGCAGAATGGGGATGTAGAGAAGTGCCAGAG 
AACCATCATTCAACACCGACACCCTTTTATTGAGTGGGCTCGGCCACCCCAGCCTTCAGGTGTGG 
ACAAAGCCAATGCGATGCCAGCTCCAAGGATATTAAGGACCCATCTTCCCACTCAGCTGCTGCCA
CCGTCTTTCTGGACAAACAACTGTAAGTACCTTTATGTGGCTC 

TTCCTTCTACCACTTCTACAGAATGTGCCAGGTGCTCCCCAATCGANGCACCTGGAATGAGTAOT 
TTGAAACCTTCATCAATGGAAAAGTAAGTTGTGGATC 






AATTGGGCCCn^TAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCGCCC 
TTCGCGGGATCCGTGK^GGGCGTCATTCACTTC^ 

GTCAGGACAGATTACAGGATTAACTGAAGGCGAGCATGGGTTCCATGTCCATCAATATGGGGACA
ATACACAAGGCTGTACCACTGCAGGACCTCATTTTAATCCTCACTCTAAGAAACATGGCGGTCCA 
GCGGATGAAGAGAGGCATGTTGGAGACCTGGGCAATGTGGCTGCTGGAAAGGACGGTGTGGCCAA 
TGTGTCCATTGAAGATCGTGTGATCTCACTCTCAGGAGAGCATTCCATCATTGGCCGTACTATGG 
TGGTCCACGAGAAACAAGATGACTTGGGCAAAGGTGGAAATGAAGAAAGTACAAAGACTGGAAAT 
GCTGGAAGCCGCTTGGCTTGTGGTGTGATTGGGATTGCCCAAAAGCTTC 

AGCACACTGGCGGCCGTTACTAGTGGATCCGAGCTCGGTACCAAGCTTGATGCATAGCTTGAGTA 
TTCTATAGTGGCACCTAAATAGCTTG&CGTAATC 

ATCCGCTCACAATTCACACAACATACNANCCGGAAGCATAAAGTGNAAAGC 

TTTNTGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAAT
TCGCCCTTATCGCGGGATCCACAAGCACAGCCTCCCTGACCTGCCTTACGACTATGGCGCGCTGG 
AGCCGCACATTAACGCGCAGATCATGCAGCTGCACCACAGCAAGCACCACGCGACCTACGTGAAC 
AATCTGAACGTCACCGAGGAGAAGTACCACGAGGCGCTGGCCAAGGGAGATGTTACAACTCAGGT 
TGCTCTTCAGCCTGCACTGAAGTTCAATGGCGGGGGCCATATCAATCACAGCATTTTCTGGACAA 
ACCTGAGCCCTAAGGGTGGTGGAGAACCCAAAGGAGAGTTGCTGGAGGCTATCAAGCGTGACTTT 
GGGTCTTTTGAGAAGTTTAAGGAGAAACTGACAGCTGTGTCTGTGGGAGTCCAAGGTTCAGGCTG 
GGGCTGGCTTGGCTTCAATAAGGAGCAAGGTCGCTTACAGATTGCCGCCTGCTCTAATCAGGACC 
CACTGCAAGGAACCACAGGCCTTATTCCACTGCTGGGGATTGATGTGTGGGAGCACGCTTACTAT 
CTTCAGTATAAAAACGTCAGACCTGACTATCTGAAAAGCCATTTGGAATGTAATCAACTGGGAGA 

ATGTTAGCCAAAGAAN 






ATTATAGAATACTCAAGOTATGCATCAAGCTTGGTACCGAGCTCGGATCCACTAGTAACGGCCGC 
CAGTGTGCTGGAATTCGCCCTTATCGCGGGATCCGTGCCCCGCTTTGACTGTGTACTCAAGTTGG 
TGCACCACTACATGCCGCCCCCAGGGGCCCCCCCCTTCTCTTTACCACCGACGGAACCCTCCTCT 
GAGGTTCAGGAGCAGCCACCTGCCCAGGCACTCCCCGGGGGTACCCCCAAGAGAGCTTACTACAT 
CTATTCTGGGGGCGAGAAGATCCCGCTGGTACTGAGCCGACCTCTCTCCTCCAACGTGGCTACCC 
TCCAGCATCTTTGTCGGAAGACTGTCAACGGTCACCTGGACTCCTATGAGAAAGTGACCCAGCTG 
CCTGGACCCATTCGGGAGTTCCTGKjACCAGTATGATGCTCCACTTTAAAGAGCAAAGAAAGGGTC 
AGAGGGGGCCTGGATCGGTCGCCTCTCCTCCGAGGCACATGGCACAAGCAAAAATCCAGCCCCAA 
TGGTCGGTAGCTCCCAGTTAGCCCAGCAGAAGATAGGCTTCTTCCTCAGGCCCTCCACTCCAAGC 
TTAAGGGCGAATTCTGCAGATATCCATCACACTGGCGGCCGCTCGAGCATGCATCTAGAGGGCCC 






GANANNCNACAANNANGGNNANNGAGGNNNGNNNNNNCNNTCCCAGTC 

ATTCGGCCCTTATCGCGGGATCCCCTGTTCGAGCTGTTTGCAGNCAAAGTTCCAAAGACAGCAGA 
AAACTTTCGTGCTTTGAGCACTGGGGAGAAAGGATTTGGCATATAAAGGGTTC 
AATTATTCCAGGATTCATGTGCCAGGGTGGTGACTTCACACGCCATAATGGCACTGGTGGCAAGT 
CCATCTACGGAGAGAAATTTGAGGATGAGAACTTCATCCTGAAGCATACAGGTCCTGGCATCTTG 
TCCATGGCAAATGCTGGACCAAACACAAATGG1TCCCAGTTTTTTATCTGCACTGCCAAGACTC 
GTGGCTGGATGGCAAGCATGTGGTCTTTGGGAAGGTGAAAGAAGGCATGAGCATTGTGGAAGCCA 
TGGAGCGTTTTGGGTCCAGGAATGGCAAGACCAGCAAGAAGATCACCATCTCCGACTGTGGACAA 
CTCTAATTTCTTTGACTTGCGGGCATTTTACCCATCAAACCATTCCTTCTGTAGCT 
ACCCCCACCANNNNNGNNATNA 






TTCCAGTGTGATGGATATCTGCAGAATTNGCCCTTTCCGAAAGATAGGCTGCNAGGTC 
TGTCTGTGGACTCTCAGTTCACCCACCTGGCCTGGATCAATACCCCACGGAAGGAGGGAGGCTTG 
GGCCCACTGAATATCCCTATGCTTGNTGATGTGACTAAAAGCTTGTCCCAGAAOTACGGCGTGTT 
GAAAAATGATGAGGGCATCGCTTACAC^GGCCT^^TTATCATNGATGCCAAGGGTGTCCTTCGCC 
AGATCACAGTCAACGACCTACCTGTGGGACGCTCTGTANATGAGGCTCTCCGCCTTGTCCAGGCC 
TTTCAGTATACAGATGAGCATGGGGAAGTCTGTCCTGCTGGCTGGAAGCCCGGCAGTGACACCAT 
CAAACCCAATGTGGATGACAGCAAGGAATACTTCTCCNAACACAACTGAGATGGGTAAACATCGG 
TAGCCTGAATCCCGGATCTCACNTGCGCCCTTACCT 






GAGACCCAAGCTTGGTACCGAGCTCGGATCCACTAGTAACGGCCGCCAGTGTGCTGGAATTCGCC 
CTTATCGCGGGATCCGACTGAGAGTCTTTTTCCCGCTCTGTGGAAAAGCCATTGAGATGAAATTC 
TTCGCAGACCGGGGCCACACTGTAGTTGGTGTAGAAA^ 

TGCAGAACAGAATCTGTCATACACAGAAGAACCGCTCACCGAAATTGCTGGTGCCAAAGTGTTTA 

AGAGTTCTTCCGGGAACATTTCCTTATACTGTTGCAGCATTTTCGACCTTCCCAGAG 

GGCAAGTTTGACAGGATTTGGGATAGAGGAGCATTGGTGGCTGTCAATCCAGGCGATCGTGACCG 

CTATGCAGATATAATACTGTCCCTGCTGAGAAGAGGGTATCACTACCTCCTGGTTGTCCTTTCTT 

ATGATCCAACAAAACACACAGGCCCGCCATTTTATGTTCCAGATGCTGAACTTAAAAAGTTATTT 

GGTACAAAATGCAACATGCAATGCCTTGAGGAGGTGGATGCTCTTGAAGAAAAGCTTGAAGGGCG 

AATTCGATATCGCGGCCGCTCGAGCATGCATCTAGAGGGCCCTATTCTATAGTGTCACCTAAATG 

CTAGAGCTCGCTTGATCAACCTCGACT 






gncnaaaanggttataatgaaagtagtgaataatgataaaaaagggtanaattaatattttcatt 
gtcatntataatcanaggcagttgggtatagactctcncncanttcataggntatttttgnaaan 
taaaaaagnnacaggttttnacgnnntggagctggtnncactttccagagcatgattagncnaac 
tccgtaatagtggcttcgagcttttccttgttagcaccagagnactccccaaccttttgaccctt 
tanatananctgnaaggtcggcatgcatttgctcacagtcngcancaacatcctggcagtcatcc 
acgtggtacttcaagnaacaccacattggaatacntgtcacagagggaatgaaagaagggcttga 
tcattttgcaaggtccacaccacgtggcagagaagtccactaccacatagcttgtctcccgcagc 
ggccanggcctcctgaaaagcttccntgctctcgatcagcttcnccattttggctgttgcgggga 
gggagccncacgagtttcggcagaacccgatggaaatgg 

NACOTAANAANAANGAATAANGNGGGAAAGATCAATANNNGNCNN^

Sagaa^ 

TNAGCGGCTTCTCCTGAGGAGGTTCCTGACCTCAGTCATCTCCAGGAAGCCTCCTCAGGGTCTGT 
GGGCTTCCCTCACCTCTACGAGCCTGCAGACCCCTCNGTACAATGCTGGTGGTCTAANNGGAACA 
CCCAGCCCTGCCCGGACATTTCACGCCACCAGAGTCTGTTCAACAACCTTTAACGTCCAGGATGG 
ACCTGACTTTCAANACAGAGTTGTCAACAGTGAGACACCAGTTGTCGTGGACTTTCATGCACAGT 
GGTATGGCCCCTGCAAGATCCTAGGACCTCGGTCAGAGAAGATGGTAGCCAAACAGCACGGGAAG
GTGGTGATGGCCAAAGTGGACATTGACGATCACACAGACCTTGCCATTGAGTACGAGGTGTCTGC 
TGTGCNTACCGNGCTGGCCATCAAGAACGGGGANGTGGGGACAAGTTTGTGGGATCAAGGCGAAG 

NCCNNNNGTNTCAA 



GCGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCG 
CCCTTATCGCGGGATCCACGTCCTCCTGATTGTGCACTACCTGCTCCTCTCCGACAGTCCTGGCA 
CAGAGACGGCCTATTTTGCTTACCTCCTCTGCGTCTGCGTGAGCAGCGTGAGCTGCTGCATCGAC 
CCCTTGATTTACTACTATGCCTCCTCCGAGTGCC^^ 

AGAAAGCTCTGATTCCAACAGTTGCAACAGCACCGGCCAGCTGATGCCCAGTAAGATGGATACCT 

GCTCTAGCCACCTGAATAATAGCATATACAAAAAGCTACTAGCTTAGGGAAAGGGTGGCTGGAAG 

GTTCCATGAAGAAAAGGTTGGAAAGTGAACAGCTGGGGAACCCCCATCAGTCCCTGGCAAGAACT 

GTATTGACTTCAACGCCTTAAGAAAACCGCCAACGTCTGATTTGCATGCATACTTCTTACAAGTG 

CTATCAAGTGTATAGATTGGATAATCACCAGCAAGGTGATGGGAACGGAGTCAAGGTTTTCCA 

GTTAAGCTTGGGCCAAAGGGCCGAATTCCANCACACTGGCGGGCGNTACTAGTGGGATCCGAGCT 

CGGGACCN 



TTGCGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAAT 
CGCCCTTATCGCGGGATCCTGTGACCCCAACTCCCCAAGCTTTTGTCAATGCCCTGAGGGCTTCA 
TCCTGGACGAGGGTTCCATATGCACAGACATTGATGAGTGCAGTCAAGGCGAATGCCTCACCAAT 
GAATGTCGAAACCTTCCTGGCTCCTATGAGTGCATCTGCGGACCTGACACAGCCCTTGCTGGTCA 
GATTAGCAAGGACTGTGACCCCATCCCTGTTCTGGAGGACT^^ 

ACCCATCAAGCAATCCGACGGTAGTCTCTTCGACAGTTCCCCCTTCTGCAAGACCAATGCACTCT 
GGTGTGCTCATTGGGATCTCCATTGCCAGCCTGTCCCTGGTGGTGGCGCTTTTGGCGCT^ 
TCACCTGCGCAAGAAGCAGGGCACTGCTCGCGCAGAGCTGGAGTACAAGTGTACCTCTTCAGCCA 
AGGAGGTAGTACTGCAGCACGTGAGGACTGATCGGACGCTGCAGACTCGGGGCCAATAAGGGCGA 
AATTCCANCACACTGGCGGNCGTTACTAGTGGGA



TTGGCGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAAT 
TCGCCCTTCGCGGGATCCGACAGGAAGGAGACCTGGGCCCAGTTTATGGATTCCAGTGGAGACAT 
TTTGGAGCAGACTACAAAGATATGGATTCAGATTACTCGGGTCAAGGAGTAGACCAGCTGCAAAA 
AGTGATTGACACCATCAAAACCAACCCCGATGACAGAAGAATCATCATGTGTGCCTGGAACCCAA
AAGATCTTCCCCTGATGGCACTGCCTCCTTGCCATGCCCTCTGTCAATTT 

GAGCTGTCTTGCCAGCTTTACCAGCGGTCAGGAGATATGGGTCTGGGTGTGCCCTTCAACATTGC

(^^TATGCTCTGCTGACCTACATGATTGCACATATCACGGGCCTGCAGCCGGGTGATTTTGTCC 

ATACTTTGGGAGATGCACACATTTATCTGAATCATATTGAGCCACTGAAAATTCAGCTACG 

GAACCAAGACCTTTCCCAAAGCTCAGAATCCTCCGAAAAGTTGAGACAATCGAAAGCTTGGCCAA 

GGGCGAATTCCAGCACACTGGCGGCCGTTACTAGTGGATCCGACTCGGTACCAAGCTTGATGCAT 

AGCTTGAGTATTCTATAGTGNCACCTAAATAGCTTGN 



NTTGTGGGAATTGTGAGCGGATACCATTTTCACACAGGAAACAGTTATGCCATGATT 
CTATTTAGGTGACACTATAGAATACTCAAGCTATGCATCAAGCTTGGTACCGAGCTCGGATCCAC 
TAGTAACGGCCGCCAGTGTGCTGGAATTCGCCCTTCGCGGGATCCGGAGTACCTGGAGCGCGAGC 
TCGGAACGAGAATCCACGAGTTGTAAGAAAATGGCAGACAAGCCGGACATGGGGGAAATCGCCAG 
CTTCGATAAGGCCAAGCTGAAGAAAACCGGGACGCAGGAGAAGAACACCCTGCCGACCAAAGAGA 
CCATTGAACAGGAAAAGAGGAGTGAAATCTCCTAAAAGCCTAGGAAGATTTCCCCACCCCACCCC 
TTCATCTCCGAGAACCCCCTCGTGATGTGGAGGAAGAGCCACCTGCAAGATGGACGCGAGCCACA 
AGCTGCACTGTGAACCCGGGCACTCCGCGCCGATCK:CACCGGCCCGTGGGTCTCTGAACX3GGACC 
CCCCCACTAATCGGACTGCCAAATTTCACCGGTTTGCCCAGGGATATTATAAGCTTGGCCAAGGG 
CGAATTCTGCAGATATCCATCACACTGGCGGCCGCTCGAGCATGCATCTAGAGGGCCCAATTCAC 

NC . 
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CCAAGCTATTTAGGTGCACTATAGAATACTCAAGCTATGCATCAAGCTTGGTACCGAGCTCGGAT 
CCACTAGTAACGGCCGCCAGTGTGCTGGAATTCGCCCTTATCGCGGGATCCGGGAACTGTGGAGT 
TTGCTCCTAGCTCAGAAAGACTCCCTOCATGGCCTGTCATTCCAGCTAATGCTTTGATTCCAACA 
CTAGCATCTGTCACTTTAGGACATACTGAACGGTACAAATTGATCAACACTACAGCACCTTTTGC 
ACAAAGCTTAAGATTGTGTATTCTACACGCGGGAAGACACTAGGTTGCCCAGGCAAAGCCAGTGG 
TCAGATGCCTTTCCTATAACCTGGGTGGGCTTTTGGAGAACCTTTGAGGAGTGATGCCATAGGCT 
CTAGAACAGGAAAGTGGGATTTGGGTGGACTTTTCCAACAGTTGTACTTTCGTAAATC 
GGGTTTTGTTTTTCCTTCTACTAGGTACTTTTGGAAGTTCAAAGTAACTC 
TTAAATGCAGGATATTTCTGCTTGGGACATCCTTGTGATTTGTACTTTATT^ 
TAACTGACAATGATGGGGATTGAACACTCGAGGGCAAGGGCGAATTCTGCAGATATCCATCACAC 
TGGCGGCCGCTCGAGCATGCATCTAGAGGGCCCAATTCCC 






CCAAGTGTGTTGGAATTCGCCCTTATCGC 

TATCATTGATAGCTTCCAGTAAAGCCTGTAGCTGTGCCCCAACCCACCCACAGACAGCTTTCTGC 
AACTCGGACCTGGTTATAAGGGCTAAATTCATGGGTTCCCCAGAAATCATCGAGACCACCTTATA 
CCAGCGTTATGAGATCAAGATGACTAAGATGCTCAAAGGATTCGACGCTGTGGGAAATGCCACAG
GTTTCCGGTTCGCCTACACCCCAGCCATGGAGAGCCTCTGTGGATATGTCCACAAGTCCCAGAAC 
CGCAGCGAGGAGTTTCTCATCGCGGGCCGTTTAAGGMCGGAAATTTGCACATCACTGCCTG^ 

CTTCCTGGTTCCCTGGCATAATCTGAGCCCTGCTCAGCGAAAGGCCTTCGTAAAGACCTATAGTG 
CTGGCTGTGGGGTGTGCACAGTGTTTCCCTGTTCAGCCATCCCTTGCAAACTGG 
CATTGCTTGTGGACAGATCAGATCCTCGTGGGCTCTCAGAAGGGCTACCAGA 
CTGCCTGCCACGGAATCCAGATTTGTAAGCTTGGCCAAAGGGCGAATTCTGCAGATATC 
ACTGGCGGCCGCTCGAGCATGCATCTAGAGGGCCCAATTCGC 



CTCTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAA 
TTCGGCACGAGGGTGACCTGTGTATTGGCCCAGCAAAATC^ 

AATGGTTATACAGGGGCTTTCCAGTGCCTCGTTGAGAAGGGAGACGTAGCCTTTGTGAA 
GACTGTCCTGGAAAACACGAACGGAAAGAACACTGCTGCATGGGCTAAGGATCTGAAGCAGGAAG 

ACTTCCAGCTGCTGTGCCCTGATGGTACCAAGAAGCCTGTAACCGAGTTC 

GCCCAAGCTCCAAACCATGTTGTGGTCTCACGAAAAGAGAAGGCAGCCCGGGTTAGCACTGTGCT 

GACTGCCCAGAAGGATTTATTTTGGAAACGTGACAAGGACTGCACTGGCAATTTC 
GGTCTTCCACCAAGGACCTTCTGTTCAGAGATGACACCAAGTGTTTGACTAAACTTCCAGAAGGT 
ACCACATATGAAGAGTACTTAGGAGCAGAGTACTTGCAAGCTGTTGGAAACATAAGGAAGTGTTC 
AACCTCACGACTCCTAGAAGCCTGCACTTTCCACAAAAGTTAAAATCCAAGAAGTGGGTGCCACT 
GTGGTGGAGGAGGATGCCCCCGTGGATCCATGGGC 



CNCAAAANI^ANANNNTTNGGNAAACCCAGGGTTTTTNCC^ 

NNGGCCNAAOTAAAANTNACCCCCCCNAAAAN^^ 

TTTTTTTTTTTCCCNTTTONCCCCW 

TTTGGTTTAGAAACTGCTTATGATTAGTCTTCCAACAGTAACTCAAAAGCATGTAAAATAA 
CAACCTACTTTCTATATAAACACCCAGCAAACTGAGGCCCCTTGTCCACCTACCCAGGCTGGCTA 
GAGGTAAGGGAGGGCTTGGACAGCATTGAGTATAGATGCTTTACTGTGGTGATGGTGGGGGCAGC 
GCCTCTCTTCACACCACACAGGTTACCCTCGTGCCGAATTCTTGGCCTCGAGGGCCAAATTCCCT 
ATAGTGAGTCGTATTAAATTCGTAATCATGTCATANC 



GANNGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCGCCC 
TTATCGCGGGATCCAGGATTGGAGCACGTGTGGTGTCAGCAGTGTAGCCTTGGCATAAGCAGTTG 
TATAAACTTTTCATTACTGTAACAAAGTGTCAGAGACAATCAGCTTATAAAGAGGAAAAGTTTGT
TTTGGGTGTTATGGTCCATTAGACGTCGTTCCTGTGGTTTAAGCCAGTGGTGA 
TGAGAGAATGTGGCAGAGCAAACTTCATAAAGGATGGAAAGAGAAAAAGTGAAGGGTTTAGAGAA
AGAAGAAGAGAGAGATGGAATGAGAGAGGGAGAACACACACATGGGTACGCGCACACACACACAC 
ACACACACACACACACACACACACACACACACACACACACGGGGTGAGAGAGAGTCAGAGACAGA 
CAGAAAGATCCTACAGTACAGTTTAAGAGTTCATTACCAGGGTAAGAGAGAGGGCTCAGCAGTTA 
AGAGTACTCGCTGCTCTTCCAGAGGACCCTGCTTTGATTCCCAGCACCCACACTGTAAGCTTGGC 
CAAAAGGGCGAATTCCAGCACACTGGCGGCCGTTACTAGTGGATNCGAGCTCGGNCCCAAGCTTG 
ATGCATAGCTTGAGTATTCTATAGTGTCACCTTAAATAGCTTTGGCGTAATCATTGGNCA 






^NGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATT 
CGCCCTTCGCGGGATCCGTAGGAGGCTCATGCGGGATTTCAAGCGATTGCAAGAGGACCCACCTG 
TGGGGGTCAGTGGTGCACCATCTGAAAACAACATCATGCAGTGGAACGCAGTTATATTTGGACCA 
GAAGGGACACCCTTTGAAGATGGTACTTTTAAACTAGTAATAGAATTTTCTGAAGAATATCCAAA 
TAAACCACCAACCGTTAGGTTTTTATCCAAAATGTTTCATCCAAATGTGTATGCTGACGGCAGCA 
TATGCTTAGACATCCTGCAGAACCGATGGAGCCCCACGTACGACGTCTCCTCCATCTTAACTTCA 
ATTCAGTCTCTGTTGGATGAGCCGAATCCAAACAGTCCGGCCAATAGCCAAGCAGCACAGCTTTA 
TCAGGAAAACAAACGGGAGTATGAGAAGAGGGTTTCGGC(^TTGTTGAGCAGAGC 
CATAATAGACACCTGGTCTGTCCACCTTTCCATCGTCGTTGTCAAGCITCGCCAAGC^GA^ 
CAGCACACTGGCGGCCGTTACTAGTGGATCCGAGCTCGGTACCAAGCTTGATGCATAGCTTGAGT 

ATTCTATAGTGCACCTAAATAGCTTGGCGTAA



rCTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAG^T 
rcGGCACGAGGAGCTACTCACATATAAGCAGACATGAGGACTCTTTTCTCACTCGGCCTCTGACT 
^AGACATGGCTTCCTGCGTCTGTGTTGTCCGTTCGGAGCAATGGCCATTAATGACCTTTGACAC 
-ACCATGAGTGACAAAGTGAAGAAAATCAATGAGCATATTAGGTCCCAAACCAAGGTCTCTGTGC 
^GGACCAGATCCTTCTGCTAGACTCCAAGATCCTCAAGCCCCATAGAGCGTTGTCATCTTATGGG 
ATTGACAAGGAAAACACTATCCACCTCACCCTAAAGGTGGTGAAGCCCAGTGATGAAGAGCTGCC 
CTTGTCTCTGGTGGAGTCGGGCGACGAGGGGCAAAGGCACCTCCTTCGAGTTCGAAGATCCAGCT 
CCGTGGCCCAGGTGAAGGAAATGATCGAGAATGTGACCGCTGTGCCTCCCAAGAAGCAGATCGTG 
AATTGCAATGGAAAGAGGCTGGAAGATGGAAAGATCATGGCCGACTACAACATCAAGAGTGGTAG 

XTTCCTCTTTCTCACAGCGCACTGCATTGGGGGGTG 

AAAACCCGACTTCCTTTAATCAATTACCAATTGCATCTCTTGATGATATAAAAAAATAATG^ 


is ferase 




ANNNNNNNNNTNCTATGACATC 

AGGCCAAGAAATTCGGCACGAGGCTAAACCCTTGCCCAAGGATATC

CTGGAGAGCATGGCGTGGTGGTGTTTTCTCTGGGGTCAATGGTCAGCAGCATGACAGAAGAAAAG 

GCCAATGCAATTGCATGGGCCCTTGCCCAGATTCCACAAAAGGTTCTTT^ 

AACCCCAGCAACCTTAGGACCCAATACCAGAGTCTACAAGTGGCTTCCCCAGAATGACCTCCTTG 

GTCATCCAAAAACCAAAGCCTTTGTAACTCATGGTGGAGCCAATGGTGTCTATGAGGCCATCTAT 

CATGGAATCCCTATGGTTGGCATTCCTATGTTTGGAGAACTACATGATAACATTGCCCACATGGT 

GGCCAAAGGAGCAGCTGTTACACTGAATATCA 

TAAAGGAAATAATAAACAATCCATTCTATAAAAAAAATGCTGTGTGGTTGTCAACCATTCACCAT 
GACCAACCTATGAAGCCCCTGGACAAGGCTGTCTTCTGGATTGAGTTTGTCATGCGC 


UDP-glucuronosyltransferase


Y00156 


Uncoupling protein 2 


AB006613


NGGGGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATT 

CGCCCTTCGCGGAATTCTGGCAGGAGCACCACAGGTGCCCTGGCTGTGGCTGTGGCCCAACCTAC 

AGATGTGGTAAAGGTCCGCTTCCAGGCCCAGGCCCGGGCTGGCGGTGGTCGGAGATACCAGAGCA 

CTGTCGAAGCCTACAAGACCATTGCACGAGAGGAAGGGATCCGGGGCCTCTGGAAAGGGACCTCT 

CCCAATCTTGCCCGAAATGCCATTGTCAACTGTACTGAGCTGGTGACCTATGACCTCATCAAAGA 

TACTCTCCTGAAAGCCAACCTCATGACAGACGACCTCCCTTGCCACTTCACTTCTGCCTTCGGGG 

CGGGCTTCTGCACCACCGTCATTGCCTCCCCCATTGATGTGGTCAAGACGAGATATATGAACTCT 

GCCTTGGGCCAGTACCACAGCGCCGGCCACTGTGCCCTGACCATGCTCCGGAAGGAGGGGCCCCG 

AGCCTTCTACAAGGGGTTCATGCCTTCCTTCCTCCGCTTGGGATCCTGGAACGTAAAGCT^ 

AAGGGCGAATTCCAGCACACTGGCGGCCGTTACTAGTGGATCCGAGCTCGGACCAACTTGATGCA 


Urokinase 
plasminogen 
activator receptor 


AF007789


GCGAATTGGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAG 

CCCTTATCGCGGGATCCACCACCGAATGGCTTCCAATGTTACAGCTGTGAGGGGAACAGCACCTT 

TGGATGTTCCTACGAAGAGACGTCCCTCATTGACTCCCGGGGACCAATGAATCAGTGCTTGGAGG 

CTACAGGCTTAGATGTGCTGGGAAACCGGAGTTACACCGTAAGAGGCTGCGCCACGGCTTCCTGG 

TGCCAAGGTTCCCACGTGGCCGACTCCATCCAGACCCACGTCAACCTCTCTATCTCCTGCTGTAA 

TGGCAGTGGCTGTAACCGCCCTACAGGGGGCGCCCCCGGGCCAGGCCCTGCTCATCTTATCCTCA 

TTGCCTCCCTGCTCCTGACCCTCAGACTGTGGGGCATCCCTCTCTGGACCTGAATCCTGAGCCGT 

CTGCCCTGGCTGGACCCAGGGACTTTTGACCTCCTCCCCTCTGCTCCATCTTTGAGGACAGGCG 

GCTGTATTGTCTTCTTGGGGCCTCAAGAACTGGAAGAGAATTGAGAAAAGGGCTGCGGAAG 

GCCAAAAGGGCGAATTCCAGCACACTGGCGGCCGTO 

n n nv*r» arparsn toys Afyv ATTTT AT AfSTGNC ACCTAAATACTTGN 


Vacuole membrane 
protein 1 


AF411216


TATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTC 

GGCACGAGGAAATAAGCAGAGAGTTTTTATCTGCAGAAGTTAGCGTGGTGGAGCCTC 

CGGGAGGGCTCTCACCAGGAAGGAAGATCCCCATTTCCAACCTGTACTGATTTTTAAAACATTTC

TCCCTGAAAGCAGTTTAGTCCACATTTTACACTC 

TCTCCACCCTCGATTCAATCCACATTGTATTCCTTAGGGTGGATATGATC 
AACAAAACTGGCCTTCTGACACTTTCACAGGGCCACATGGTCCAACTGGAGAACCTCGGCCACAC 

AGAACCTTCTGACGTATGTTAAATATGCCAGGCTTTTCAGGCTTGTC 
TAAGTCACCAAATGTATATAAGTTATATATGTTGGATAGCAGTCTTGCATGCCT^ 

ATGTAATATGCCTTCCTTTCCCACCCTCAAAAAGGCCATT^ 
GGAAATTGATCTTTAAATTTTGAGACAGTATAAGGA 
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GAATTGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCGCCC 
TTCGCGGGATCCGAACTGCAGCCTCTTTCTCAAAATACAACACTCTCCTTCATGGCTACAAAAAT 
GGAAGATTCCGGCATTTATGTATGTGAAGGGATTAATGAGGCTGGAATTAGCAAAAAATCAGTTG 
AACTGATTATCCAAGGCTCTTCGAAGGACATAC^GCTTACAGCCTTCCCATCTAAGAGCGTCAAA 
GAGGGAGACACTGTCATTATCTCCTGTACTTGTGGAAGTGTGCCCGAAATATGGATAATTCTGAA 
AAAGAAAGTCAAGACAGGAGACATGGTGCTAAAGTCTGTTAATGGCTCGTACACCATCCGCAAGG 
CACAGCTGCAGGATGCCGGAGTATACGAGTGTGAATCGTAAACCGAAGTCGGCTCGCAGTTGCGA 
AGTTTAACACTTGATGTGAAAGGAAAAGAAAATAACAAGGACTATTTTTC 

ACTCTACTTTGCATCCTCCTTGGTAATACCCGCCATTGGGATGATCATTTAAGCTTGGCCAAGOT 
CGAATTCCAGCACACTGGCGGCCGTTACTAGTGGATCCGAGCTCGGACCAAGNTGNTGCATAGCT 
TGAGTATTCTATAGTGNCACCTAAA 



GCGAATTGGCCCTCTAGATGCATGCTCGAGCGGCCGCCAGTGTGATGGATATCTGCAGAATTCGC 
CCTTATCGCGGGATCCAGAGATGAGCTTCCTGCAGCATAGCAGATGTGAATGCAGACCAAAGAAA 
GATAGAACAAAGCCAGAAAATCACTGTGAGCCTTGTTCAGAGCGGAGAAAGCATTTGTTTGTCC^ 
AGATCCGCAGACGTGTAAATGTTCCTGCAAAAACACAGACTCGCGTTGCAAGGCGAGGCAGCTTG 
AGTTAAACGAACGTACTTGCAGATGTGACAAGCCAAGGCGGTGAGCCAGGCTGCAGGAAGGAGCC 
TCCCTCAGGGTTTCGGGAACTAGACCTCTCACCGGAAAGACCGATTAACCATGTCACCACCACAC 
CACCATCGTCACCGTCGACAGAACAGTCCTTAATCCAGAAAGCCTGACATGAAGGGAGAGGAAGC 
TTGGCCAAAAGGGCGAATTCCAGCACACTGGCGGCCGTTACTAGTGGATCCGAGCTCGGTACCAA 
GCTTGATGCATAGCTTGAGTATTCTATAGTGTCACCTAAATAGCTTGGCGTAATCATGGTCATAG 
CTGTTTCCTGTGTGAAATTGTTATCCGCTCACAATTCCACACAACATACGAGCCGGAAGCATAAA 
GTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACTCACATTAATTGCGT 



TCCTCATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGA 
ATTCGGCACGAGGTAGATTTTGTTGAAGAAGTGAATGTTTACGGTGTGCCCGTGCCAGGTCATGA
AGGTCGCATCGGGATGGCCTCGATCAAGATGAAAGAAAACTACGAGTTCAATGGAAAGAAACTCT 
TTCAGCACATCTCGGAGTACCTGCCCAGTTACTCGAGGCCTCGGTTCCTGAGAATACAAGATACC 
ATTGAGATCACCGGGACTTTTAAACACCGCAAAGTGACCCTGATGGAAGAGGGCTTTAACCCCTC 
AGTCATCAAAGATACCTTGTATTTCATGGATGACACAGAAAAAACATACGTGCCCATGACTGAGG 
ACATTTATAATGCCATAATTGATAAGACTCTGAAGCTCTGAATGTTGCCTGGCTCCTAACACTTC 
CAGAAAGAAACACAATAGGCCTAGCATAGCCCCTTCACATGTGTAATCCAACTTTAACTTGATTA 
AAGGTTATAGGTGTGATTTTTCCTAGGAAATTATTCATTTAAAGGACAATT^ 
TTGGTTTTTATTAATTACACCAGAACGTTTGCAAGTAAAAAGATTTAAAG^ 
TGTGCACCTGCCATTTGTCCTTGCAAACTTAACTTCTTGGAGAGAG 






TGATTACCCCC^^GCTATTTAGGTGCCCTATAGAATACTCAAGCTATGC^TCAAGCTTGGTACCG 
AGCTCGGATCCACTAGTAACGGCCGCCAGTGl^TGGAATTCGCCCTTATCGCGGGATCCGGCAG 
AGGAGAGAATCCAAATGCTGCTTTCTTTTATATAAATGC 

AACGTTATGCACTTTAAACAGTCAAGGTTAAGTATGATGTGGTTTATCAAATC^ 
TAGAAAACAAGGCTATCAAGCTGCTATAATGTGAAAAATCAAACTATCACATCTCCCACCTGCTT 
CAGGTCATCTGAGTGTTTTTTCTAGAGATTTGTTAATTTTGAGT^ 
TAAAAGGTTGTTCATATCAATGTAGAAATGAGTTTCACGCAAAG 

GTGTCTAGCACAGTAATGTTCAAGGTAATGGAGTGGTCGGGACACCTTCCCTTTCTGAAGCAGAG 
AAAACCATCTACAAGCGTGTGTGTGTTACTGCATCATCTGTGCGCTGGTAGAACAATGTTCCTAA 
GCACGACAACACTGATCGATAAGCTTGGCCAAAGGGCGAATTCTGCAGATATCCATCACACTGGC 
GGCCGCTCGAGCATGCATCTAGAGGGCCCAATTC 



TCTTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAA 

TTCGGCACGAGGGTGTTTTTTTTTTTTTT 
TCCTCTAGATTCCCTTCTTGTTTTCTCACCATCCCACGTGAGA 

ATCTTCTTTACCACTGAGGAAAGACAGAATCCTGCTAGAGGCCAGAAAGAATGTTCCAGACATGG 
ATGGAAGGCTCCTGACTGTTGACTTCAATGCCCCTGAAGGTAGGGAGTGCTCCAGGTCTGCCCCC 
AGGCTCCGAGGGTGGGTCTCCTAGGGGCTGGAAAATGCCCCACCAATCTGGCTAAGATAAGGAAA 
GGATATGAAGAGAAAGTTACAGAAACTTGAAGGGTAAAGCTAAGTCACTGAGAGAGTTATTGTAA 
GTTGCAGAGAAAATAAGTTGATGCGTGGTTCAGGGTCTGTGCAGAAAAGTGGACAGCACCTAATA 
GCTGTAGAAGAAGATGCAGAGAGACTGAAAAAAAAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAA 
C^CAGGAAAAAAAAAAAAAAAAAACACATGCGGNCGCAAGCTTATTCCCTTTAATGANGGTTAAT 
TTTACTTGGCNCTGGNCGCGTTTTTCAACGTCGTGACTGGG 
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CTATGACATGATTACGAATTTAATACGACTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATT 

CGGCACGAGGGTCTCCTTCACTTCGCCCTCCAGCCGCGGTGGCTGCAGCGCAACTTCCAGATAGC 

GGAGTGGCCTCAGCTGCGAGCCGAGCGGTGGCGGCAGCGCCCCTCAGGACACCCGCAGATCACCT 

TTTCCCCGCGACTTCGCCATGGCTGAATGTTGTGTACCGGTATGCCAACGGCCAATTTC 

TCCACCCTATGCTGACCTTGGCAAAGCTGCCAGAGATATTTTCAAC 

TGGTAAAGCTGGATGTGAAAACGAAGTCATGCAGTGGTGTGGAATTTTCAACATC 

AATACAGACACTGGTAAAGTCAGTGGGACCTTGGAGACCAAGTACAAATGGTGTGAGTATGGTCT 

GACTTTCACAGAGAAATGGAACACTGACAACACTCTGGGGACGGAGATTGCAATTGAAGACCAGA 

TTTGTCAAGGTTTGAAACTGACCTTTGACACCACGTTTTCACCAAACACAGGAAAGAAAAGTGGT 

AAAATCAAGTCTGCTTACAAGANGGAATGTATAAACCTTGGCTC 

TGGGACCTGCCATCCATGGGTCACCCGTNTTTGGNTACGGGGGGGGGGGGGGNG 
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AGGTGTGNAAATGACCGAGGAACTCAGATTTCTTGAATTGACTAAGACTTCACCAATGGGGTCAG 
AGGTAAACTTTGGTCGTGGGCGAAAGTTCNTCCGAACTGTCTGAAGA 

GTTCAGGCTATTGGTCTCTAAGTTATAATTAAAGCCGGAGCTGATCAGAGAGTCCTCTGGGGGAC 
TAGAAGAAATCTTCAGTTCTGATTCCTCCTTCCTCTCCCGTGCTAGAATGATTTTGGTCCACAGG 
TCTTCCTGGTTGTCAAATTTTATCTCAGAGGCTGACACGTAGCAGGGCTCACTCTGAAGATAGCG 
TTCCAACTCCAGGCAGGTCTGTTGCCAATATTCCTCCAGGGACGGCAGAGCCGAGAAGTAGCCCG 
TTTCGTGCACAATCTGTAGTTCCTGGAAGATGCTACACATTGGGAGCACATCCATGTCGGGTT^ 
AAAAGACAGTCCCCGCTGTCGGGAAAACAGGGAGGTGAACGATCAGGAGTCGGAGCAGAAACTGT 
TCCCGGGAGCGCAGGTGAAAGTTTCATGCAAACTGGATGGCGCTGCAATCGGACGCCGGGTCCGG 
ACCCTCCCGCAGCCCGCAGCGCGCCGAGCCCACGCAATATTTGCCTCGTGCCGAATTCTTGGCCT 
CGAGGGCCAAATTCCCTATAGTGAGTCGTATTAAATTCGTAATCATGTCATANNNNG 
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